Smoothed Analysis of Renegar's Condition Number for Linear Programming

John Dunagan*
Microsoft Research

Daniel A. Spielman†
Department of Mathematics
Massachusetts Institute of Technology

Shang-Hua Teng‡
Department of Computer Science
Boston University and
Akamai Technologies Inc.

February 1, 2008



Abstract

We perform a smoothed analysis of Renegar's condition number for linear programming.
In particular, we show that for every n-by-d matrix $\bar{A}$, n-vector $\bar{b}$ and d-vector $\bar{c}$ satisfying
$\|\bar{A}, \bar{b}, \bar{c}\|_F \le 1$ and every $\sigma \le 1/\sqrt{dn}$, the expectation of the logarithm of $C(A, b, c)$ is
$O(\log(nd/\sigma))$, where $A$, $b$ and $c$ are Gaussian perturbations of $\bar{A}$, $\bar{b}$ and $\bar{c}$ of variance $\sigma^2$.
From this bound, we obtain a smoothed analysis of Renegar's interior point algorithm. Combined
with the smoothed analysis of finite termination, this yields an interior point algorithm for
linear programming whose smoothed complexity is $O(n^3 \log(nd/\sigma))$.

1 Introduction



In [ST03b], Spielman and Teng introduced the smoothed analysis of algorithms as an alternative
to worst-case and average-case analyses in the hope that it could provide a measure of the
complexity of algorithms that better agrees with practical experience. The smoothed complexity
of an algorithm is the maximum over its inputs of the expected running time of the algorithm
under slight perturbations of that input. In this paper, we perform a smoothed analysis of
Renegar's condition number for linear programs, and thereby obtain a smoothed analysis of
his interior-point algorithm. Interior point algorithms for linear programming are exciting both
because they are known to run in polynomial time [Kar84] in the worst case and because they
have been used to efficiently solve linear programs in practice. In fact, the speed of interior
point methods in practice is much better than that proved in their worst-case analyses [IL94,
LMS90, EA96]. This discrepancy between worst-case analysis and practical experience is our
main motivation for studying the smoothed complexity of interior point methods.

Our main result is that the smoothed value of the logarithm of Renegar's condition number, to be
defined in Section 1.2, is $O(\log(nd/\sigma))$. That is, for each $(\bar{A}, \bar{b}, \bar{c})$ and $\sigma \le 1/\sqrt{dn}$,

$$\mathbf{E}_{(A,b,c) \leftarrow \mathcal{N}((\bar{A},\bar{b},\bar{c}),\sigma)} \left[ \log C(A,b,c) \right] = O(\log(nd/\sigma)),$$

where $\mathcal{N}((\bar{A}, \bar{b}, \bar{c}),\sigma)$ is the distribution of Gaussian perturbations of $(\bar{A}, \bar{b}, \bar{c})$ of variance $\sigma^2$,
and $(A, b, c) \leftarrow \mathcal{N}((\bar{A}, \bar{b}, \bar{c}),\sigma)$ indicates that $(A, b, c)$ is chosen according to this distribution.
As Renegar's algorithm [Ren95b] takes $O(\sqrt{n} \ln(C(A, b, c)/\epsilon))$ iterations to find a solution of
relative accuracy $\epsilon$, we find that the smoothed complexity of Renegar's algorithm when it is
asked for a solution of relative accuracy $\epsilon$ is $O(n^3 \log(nd/\sigma\epsilon))$.

As explained in [ST03a], when one combines this analysis with the smoothed analysis of the
finite termination procedure in that paper, one obtains an interior point algorithm that returns
the exact answer to the linear program and has smoothed complexity $O(n^3 \log(nd/\sigma))$. In
comparison, the best-known bound on the worst-case complexity of any linear programming
algorithm is Vaidya's [Vai90] bound of $O(((n + d)d^2 + (n + d)^{1.5} d)L)$, and the best known bound
for an interior point method is $O(n^3 L)$, first due to Gonzaga [Gon88].

1.1 The Complexity of Linear Programming Algorithms 

A linear program is typically specified by a matrix $A$ together with two vectors $b$ and $c$. If $A$ is
an n-by-d matrix, then $b$ is an n-vector and $c$ is a d-vector. There are several canonical forms of
linear programs specified by $(A, b, c)$. The following are four commonly used canonical forms:

$$
\begin{array}{llll}
\max\ c^T x \ \text{ s.t. } Ax \le b & \text{and its dual} & \min\ b^T y \ \text{ s.t. } A^T y = c,\ y \ge 0 & (1) \\
\max\ c^T x \ \text{ s.t. } Ax \le b,\ x \ge 0 & \text{and its dual} & \min\ b^T y \ \text{ s.t. } A^T y \ge c,\ y \ge 0 & (2) \\
\max\ c^T x \ \text{ s.t. } Ax = b,\ x \ge 0 & \text{and its dual} & \min\ b^T y \ \text{ s.t. } A^T y \ge c & (3) \\
\text{find } x \ne 0 \ \text{ s.t. } Ax \le 0 & \text{and its dual} & \text{find } y \ne 0 \ \text{ s.t. } A^T y = 0,\ y \ge 0 & (4)
\end{array}
$$

Without loss of generality, we assume that $n \ge d$ for the remainder of the paper. The worst-case
complexity of solving linear programs has traditionally been stated in terms of $n$, $d$, and $L$,
where $L$ is commonly called the "bit-length" of the input linear program, but is rarely defined
to actually be the number of bits necessary to specify the linear program. For integer $A$, $b$, $c$,
Khachiyan [Kha79] and Karmarkar [Kar84] defined $L$ to be some constant times

$$\log(\text{largest absolute value of the determinant of any square sub-matrix of } A) + \log(\|c\|_\infty) + \log(\|b\|_\infty) + \log(n + d).$$
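As a rough illustration of this quantity, the following sketch evaluates the bracketed sum for a tiny integer program by brute force. The function names are ours, the unspecified constant is taken to be 1, logarithms are base 2, and the data is assumed to be nonzero so that every logarithm is defined. The enumeration is exponential in the matrix size and is meant only for toy inputs.

```python
from itertools import combinations
from math import log2


def det(m):
    """Integer determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def max_abs_subdet(A):
    """Largest |det| over all square sub-matrices of A (brute force)."""
    n, d = len(A), len(A[0])
    best = 0
    for k in range(1, min(n, d) + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(d), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                best = max(best, abs(det(sub)))
    return best


def bit_length_L(A, b, c):
    """The sum in the definition of L, with the leading constant assumed to be 1."""
    n, d = len(A), len(A[0])
    norm_inf = lambda v: max(abs(x) for x in v)
    return (log2(max_abs_subdet(A)) + log2(norm_inf(c))
            + log2(norm_inf(b)) + log2(n + d))


A = [[2, 0], [0, 4], [1, 1]]
b = [8, 4, 2]
c = [1, 2]
print(max_abs_subdet(A))      # largest sub-determinant comes from rows 0 and 1
print(bit_length_L(A, b, c))  # log2(8) + log2(2) + log2(8) + log2(5)
```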

In the smoothed model, complexity estimates in terms of $L$ are quite pessimistic: even if one
perturbs just the least significant digit of each entry of $A$, the resulting $L$ value is at least
some constant times $d$ with high probability. Thus, in the smoothed model, our analysis of the
complexity of interior point methods replaces $L$, which is typically $\Omega(d)$, with $\log(nd/\sigma)$.



1.2 Renegar's Condition Number 

In [Ren95b, Ren95a, Ren94], Renegar defined the condition number $C(A, b, c)$ of a linear program
and proved that an interior point algorithm whose complexity was $O(n^3 \log(C(A, b, c)/\epsilon))$
could solve a linear program to relative accuracy $\epsilon$, or determine that the program was infeasible
or unbounded.

For a linear program in the canonical form (1), we follow Renegar [Ren94, Ren95a, Ren95b] in
defining the primal condition number, $C_P^{(1)}(A, b)$, of the program to be the normalized reciprocal
of the distance to ill-posedness. A program is ill-posed if the program can be made both feasible
and infeasible by arbitrarily small changes to the pair $(A, b)$. The distance to ill-posedness
of the pair $(A, b)$ is the distance to the set of ill-posed programs under the Frobenius norm.
We similarly define the dual condition number, $C_D^{(1)}(A, c)$, to be the normalized reciprocal of
the distance to ill-posedness of the dual program. The condition number, $C^{(1)}(A, b, c)$, is the
maximum of $C_P^{(1)}(A, b)$ and $C_D^{(1)}(A, c)$.

We can equivalently define the condition number without introducing the concept of ill-posedness.
For programs of form (1), we define $C_P^{(1)}(A, b)$ by

Definition 1.2.1 (Primal Condition Number).

(a) if $Ax \le b$ is feasible, then

$$C_P^{(1)}(A, b) = \frac{\|A, b\|_F}{\sup\{\delta : \|\Delta A, \Delta b\|_F \le \delta \text{ implies } (A + \Delta A)x \le (b + \Delta b) \text{ is feasible}\}};$$

(b) if $Ax \le b$ is infeasible, then

$$C_P^{(1)}(A, b) = \frac{\|A, b\|_F}{\sup\{\delta : \|\Delta A, \Delta b\|_F \le \delta \text{ implies } (A + \Delta A)x \le (b + \Delta b) \text{ is infeasible}\}}.$$

It follows from the definition above that $C_P^{(1)}(A, b) \ge 1$. We define the dual condition number,
$C_D^{(1)}(A, c)$, analogously.

To readers familiar with condition numbers in contexts outside of linear programming, the above
definition may be surprising: the condition numbers for numerous other problems are defined
as the sensitivity of the output to perturbations in the input, and are then often related to
the distance to ill-posedness. Renegar inverts this scheme by defining the condition number for
linear programming to be the normalized reciprocal of the distance to ill-posedness, and then
proving that the condition number does bound the sensitivity of the output to perturbations in
the input [Ren94, Ren95a].

Any linear program may be expressed in form (1); however, transformations among linear
programming formulations do not in general (and commonly do not) preserve condition number
[Ren95a]. We will therefore have to define different condition numbers for each normal form
we consider. For linear programs with canonical forms (2), (3), and (4) we define their condition
numbers, $C^{(2)}(A, b, c)$, $C^{(3)}(A, b, c)$ and $C^{(4)}(A)$, analogously. We follow the convention that
$0$ is not considered a feasible solution to (4). Just as for $C_P^{(1)}(A, b)$, $C^{(i)} \ge 1$ for all $i$.

For linear programs given in form (2), Renegar [Ren95b, Ren95a, Ren94] developed an
initialization phase that returns a feasible point with initial optimality gap $R \le O(nC(A, b, c))$
for a linear program $(A, b, c)$ or determines that the program is infeasible or unbounded, in
$O(n^3 \log(C(A, b, c)))$ operations. By applying $O(\sqrt{n} \log(nC(A, b, c)/\epsilon))$ iterations of a primal
interior point method, for a total of $O(n^3 \log(nC(A, b, c)/\epsilon))$ arithmetic operations, Renegar
proved:

Theorem 1.2.2 (Renegar). For any linear program of form (2) specified by $(A, b, c)$ and
parameter $\epsilon$, Renegar's interior-point algorithm, in $O(n^3 \log(nC(A, b, c)/\epsilon))$ operations, finds a
feasible solution $x$ with optimality gap $\epsilon \|A, b, c\|_F$, or determines that the program is infeasible
or unbounded.

Subsequently, algorithms with complexity logarithmic in the condition number were developed
by Vera [Ver96] for forms (1) and (3) and by Cucker and Peña [CP01] for form (4). The
complexities of their algorithms are similar to that of Renegar's. In [FV00], Freund and Vera
give a unified approach which both efficiently estimates the condition number and solves the
linear programs in any of these forms.

1.3 Smoothed Analysis of Condition Number: Our Results 

In this paper, we consider linear programming problems in which the data is subject to slight
Gaussian perturbations. Recall that the probability density function of a Gaussian random
variable with mean $\bar{x}$ and variance $\sigma^2$ is given by

$$\mu(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \bar{x})^2/(2\sigma^2)}.$$

A Gaussian perturbation of a vector $\bar{x}$ of variance $\sigma^2$ is a vector whose $i$th element is a Gaussian
random variable of variance $\sigma^2$ and mean $\bar{x}_i$, and in which each element is independently chosen.
Thus, the probability density function of a d-dimensional Gaussian perturbation of $\bar{x}$ of variance
$\sigma^2$ is given by

$$\mu(x) = \frac{1}{(\sigma\sqrt{2\pi})^d}\, e^{-\|x - \bar{x}\|^2/(2\sigma^2)}.$$

A Gaussian perturbation of a matrix may be defined similarly.

For each $(\bar{A}, \bar{b}, \bar{c})$ and $\sigma > 0$, we let $\mathcal{N}((\bar{A}, \bar{b}, \bar{c}),\sigma)$ denote the distribution of Gaussian
perturbations of $(\bar{A}, \bar{b}, \bar{c})$ of variance $\sigma^2$, and we let $(A, b, c) \leftarrow \mathcal{N}((\bar{A}, \bar{b}, \bar{c}), \sigma)$ indicate that $(A, b, c)$
is drawn from the distribution $\mathcal{N}((\bar{A}, \bar{b}, \bar{c}),\sigma)$.
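The sampling model can be sketched in a few lines. This is only an illustration under our own naming (`gaussian_perturbation` and `perturb_program` are not from the paper): each entry of the center data receives independent Gaussian noise of standard deviation sigma, matching the definition of a Gaussian perturbation of variance sigma squared.

```python
import random


def gaussian_perturbation(center, sigma, rng):
    """Add independent N(0, sigma^2) noise to every entry of a matrix (list of rows)."""
    return [[rng.gauss(v, sigma) for v in row] for row in center]


def perturb_program(A_bar, b_bar, c_bar, sigma, seed=0):
    """Draw (A, b, c) from the Gaussian perturbation model centered at (A_bar, b_bar, c_bar)."""
    rng = random.Random(seed)
    A = gaussian_perturbation(A_bar, sigma, rng)
    b = gaussian_perturbation([b_bar], sigma, rng)[0]
    c = gaussian_perturbation([c_bar], sigma, rng)[0]
    return A, b, c


A_bar = [[0.3, 0.0], [0.0, 0.3], [0.1, 0.1]]
b_bar = [0.2, 0.2, 0.1]
c_bar = [0.1, 0.2]
A, b, c = perturb_program(A_bar, b_bar, c_bar, sigma=0.05)
print(A, b, c)
```

The fixed seed is a convenience for reproducibility, not part of the model.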

Our main result, which is proved in Section 4, is

Theorem 1.3.1 (Smoothed Complexity of Renegar's Condition Number). For every
n-by-d matrix $\bar{A}$, n-vector $\bar{b}$ and d-vector $\bar{c}$ such that $\|\bar{A}, \bar{b}, \bar{c}\|_F \le 1$, every $\sigma \le 1/\sqrt{nd}$, and
every $i \in \{1, 2, 3, 4\}$,



Pr 

A.b.c 



5a 2 



and

$$\mathbf{E}_{(A,b,c) \leftarrow \mathcal{N}((\bar{A},\bar{b},\bar{c}),\sigma)} \left[ \log C^{(i)}(A, b, c) \right] \le 15 + 4.5 \log \frac{nd}{\sigma}.$$

Theorem 1.3.1 implies a bound on the smoothed complexity of Renegar's algorithm as well as
a bound on the smoothed complexity of the interior point methods that were developed for the
other canonical forms. Note that in the statement of Theorem 1.3.1 we abuse the notation
$C^{(4)}(A, b, c)$ for $C^{(4)}(A)$. The first bound of the theorem means that, with high probability, the
condition number of a perturbed linear program is polynomial in $n$, $d$, and $1/\sigma$.

The following theorem follows immediately from Renegar's analysis (Theorem 1.2.2) and the
previous theorem.

Theorem 1.3.2 (Smoothed Complexity of IPM). Let $T((A, b, c), \epsilon)$ be the time complexity
of Renegar's interior point algorithm for finding $\epsilon$-accurate solutions of the linear program defined
by $(A, b, c)$ or determining that the program is infeasible or unbounded. For every n-by-d matrix
$\bar{A}$, n-vector $\bar{b}$ and d-vector $\bar{c}$ such that $\|\bar{A}, \bar{b}, \bar{c}\|_F \le 1$ and every $\sigma \le 1/\sqrt{nd}$,

$$\mathbf{E}_{(A,b,c) \leftarrow \mathcal{N}((\bar{A},\bar{b},\bar{c}),\sigma)} \left[ T((A, b, c), \epsilon) \right] = O\left( n^3 \log \frac{nd}{\sigma\epsilon} \right).$$

In order to analyze Renegar's condition number for the primal and dual of each of the four
canonical forms, we found it necessary to develop several extensions to the theory of condition
numbers that may be of independent interest. For example, Lemma 2.2.2 generalizes the
geometric condition on distance to ill-posedness developed in [CC01] by incorporating an
arbitrary non-pointed convex cone that is not subject to perturbation, and this generalization is
necessary for the application of our techniques. Additionally, Lemmas 2.3.2, EU and EH all
provide geometric conditions on the distance to ill-posedness whose import to us is on par with
Lemma 2.2.2.



1.4 Organization of the Paper 



In our analysis, we divide the eight condition numbers $C_P^{(i)}$ and $C_D^{(i)}$, for $i \in \{1, 2, 3, 4\}$, into
two groups. The first group includes $C_P^{(1)}$, $C_P^{(2)}$, $C_D^{(2)}$, $C_D^{(3)}$, and with some additional work,
$C_P^{(4)}$. The remaining condition numbers belong to the second group. We will refer to a condition
number from the first group as a primal condition number and a condition number from the
second group as a dual condition number.

Section 2 is devoted to establishing a smoothed bound on the primal condition number. We
remark that the techniques used in Section 2 do not critically depend upon $A$, $b$ and $c$ being
Gaussian distributed, and similar theorems could be proved using slight modifications of our
techniques if these were smoothly distributed within spheres or cubes. It follows from the result
of Section 2 alone that Theorem 1.3.1 holds for linear programs given in form (2).

In Section 3 we establish the smoothed bound on the dual condition number. Our bounds in
this section do critically make use of the Gaussian distribution on $A$, $b$ and $c$.

In Section 4 we prove Theorem 1.3.1 using the smoothed bounds of the previous two sections.
We conclude the paper in Section 5 with some open questions.

In the remainder of this Section, we review some of the previous work on smoothed analysis, 
some earlier results on the average-case analysis of interior-point algorithms, and lower bounds 
on the complexity of interior-point algorithms. 

1.5 Prior Smoothed Analyses of Linear Programming Algorithms 

In their paper introducing smoothed analysis [ST03b], Spielman and Teng proved that the
smoothed complexity of a two-phase shadow vertex simplex method was polynomial in $n$, $d$
and $1/\sigma$. Shortly thereafter, Blum and Dunagan [BD02] performed a smoothed analysis of the
perceptron algorithm for linear programming. They showed that the probability the perceptron
algorithm would take more than a polynomial in the input size times $k$ steps was inversely
proportional to $\sqrt{k}$. Their analysis had the advantage of being significantly simpler than that
of [ST03b], and it is their analysis that we build upon in this work. Blum and Dunagan's analysis
used the fact that the number of steps taken by the perceptron algorithm can be bounded by the
reciprocal of the "wiggle room" in its input, and the bulk of their analysis was a bound on the
probability that this "wiggle room" was small. The "wiggle room" turns out to be a condition
number of the input to the perceptron algorithm.

1.6 Prior Average-Case Analyses of Interior Point Algorithms 

There has been an enormous body of work on interior point algorithms, some of which has
addressed their average-case complexity. Anstreicher, Ji, Potra and Ye [AJPY93, AJPY99]
have shown that under Todd's degenerate model for random linear programs [Tod91], a
homogeneous self-dual interior point method runs in $O(\sqrt{n} \log n)$ expected iterations. Borgwardt
and Huhn [HB02] have obtained similar results under any spherically symmetric distribution.
The performance of other interior point methods on random inputs has been heuristically
analyzed through "one-step analyses", but it is not clear that these analyses can be made
rigorous [Nem88, UT92, MTY93].

1.7 Lower Bounds for Interior Point Algorithms

The best known lower bound on the complexity of interior point methods is $\Omega(n^{1/3})$ iterations
due to Todd [Tod94] and Todd and Ye [TY96]. However, the programs for which these lower
bounds hold are very ill-conditioned. There are no known bounds of the form $\Omega(n^{\epsilon})$ for
well-conditioned linear programs. It would be interesting to know whether such a lower bound can
be proved for a well-conditioned program, or whether interior point algorithms always require
fewer iterations when their input is well-conditioned.

1.8 Notation and Basic Geometric Definitions 

Throughout this paper we use the following notational conventions. The material up to this
point has obeyed these conventions.

• lower case letters such as $a$ and $\alpha$ denote scalars,

• bold lower case letters such as $\mathbf{a}$ and $\mathbf{b}$ denote vectors, and for a vector $\mathbf{a}$, $a_i$ denotes the
$i$th entry of $\mathbf{a}$,

• capital letters such as $A$ denote matrices, and

• bold capital letters such as $\mathbf{C}$ denote convex sets.

If $a_1, \ldots, a_n$ are vectors, we let $[a_1, \ldots, a_n]$ denote the matrix whose rows are the $a_i$s. For a
vector $a$, we let $\|a\|$ denote the standard Euclidean norm of the vector. We will make frequent
use of the Frobenius norm of a matrix, $\|A\|_F$, which is the square root of the sum of squares
of the entries in the matrix. We extend this notation to let $\|A, x_1, \ldots, x_k\|_F$ denote the square
root of the sum of squares of the entries in $A$ and in $x_1, \ldots, x_k$. Different choices of norm are
possible; we use the Frobenius norm throughout this paper. The following proposition relates
several common choices of norm:

Proposition 1.8.1 (Choice of norm). For an n-by-d matrix $A$,

$$\frac{\|A\|_F}{\sqrt{dn}} \le \|A\|_\infty \le \|A\|_F, \quad \text{and}$$

$$\frac{\|A\|_F}{\sqrt{d}} \le \|A\|_{OP} \le \|A\|_F,$$

where $\|A\|_{OP}$ denotes the operator norm of $A$, $\max_{x \ne 0} \|Ax\| / \|x\|$.

We let log denote the logarithm to base 2 and ln denote the logarithm to base e.
We also make use of the following geometric definitions:

Definition 1.8.2 (Ray). For a vector $p$, let $\mathrm{Ray}(p)$ denote $\{\alpha p : \alpha > 0\}$.

Definition 1.8.3 (Non-pointed convex cone). A non-pointed convex cone is a convex set $C$
such that for all $x \in C$ and all $\alpha > 0$, $\alpha x \in C$, and there exists a vector $t$ such that $t^T x < 0$
for all $x \in C$.

Definition 1.8.4 (Positive half-space). For a vector $a$ we let $\mathcal{H}(a)$ denote the half-space of
points with non-negative inner product with $a$.

For example, $\mathbb{R}^d$ and $\mathcal{H}(x)$ are not non-pointed convex cones, while $\{x : x_0 > 0\}$ and $\mathrm{Ray}(p)$
are non-pointed convex cones. Note that a non-pointed convex cone cannot contain the origin.
All of the cones that we introduce through the process of homogenization are non-pointed convex
cones.

These definitions enable us to express the feasible $x$ for the linear program

$$Ax \ge 0 \text{ and } x \in C$$

as

$$x \in C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i),$$

where $a_1, \ldots, a_n$ are the rows of $A$. Throughout this paper, we will call a set feasible if it is
non-empty, and infeasible if it is empty. Thus, we say that the set $C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i)$ is feasible if
the corresponding linear program is feasible.

2 Primal Condition Number



In this section we show that the smoothed value of the primal condition numbers is polynomial
in $n$, $d$, and $1/\sigma$ with polynomially high probability. As in the work of Peña [Peñ00], we unify
this study by transforming each canonical form to conic form.

The primal program of form (1) can be put into conic form with the introduction of the
homogenizing variable $x_0$. Setting $C = \{(x, x_0) : x_0 > 0\}$, the homogenized primal program of form (1)
is

$$[-A, b](x, x_0) \ge 0, \quad (x, x_0) \in C.$$

By setting $C = \{(x, x_0) : x_0 > 0 \text{ and } x \ge 0\}$, one can similarly homogenize the primal program
of form (2). The dual programs of form (2) and form (3) can be homogenized by setting
$C = \{(y, y_0) : y_0 > 0\}$ and $C = \{(y, y_0) : y_0 > 0 \text{ and } y \ge 0\}$, respectively, and considering the
program

$$[-A^T, c](y, y_0) \ge 0, \quad (y, y_0) \in C.$$

We will comment on $C_P^{(4)}$ below. Note that in each of these homogenized programs, the variables
lie in a non-pointed convex cone.

Peña [Peñ00] proves:

Fact 2.0.1 (Preserving feasibility). Each of the homogenized programs is feasible if and only
if its original program is feasible.

In Section 2.1 we extend the notion of distance to ill-posedness and condition number to conic
linear programs and note that the transformation by homogenization does not alter the distance
to ill-posedness. The rest of the section will be devoted to analyzing the condition number of the
conic program, and this will imply the bound on the condition number of the original program.
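For the primal program of form (1), the correspondence behind Fact 2.0.1 can be made concrete. The sketch below is our own illustration (the helper names `homogenize`, `satisfies`, `lift` and `project` are hypothetical): a point x with Ax ≤ b lifts to (x, 1), and any homogeneous solution (x, x_0) with x_0 > 0 projects back to x / x_0.

```python
def homogenize(A, b):
    """Rows of [-A, b], the constraint matrix of the homogenized program."""
    return [[-v for v in row] + [bi] for row, bi in zip(A, b)]


def satisfies(M, z, tol=1e-12):
    """Check M z >= 0 row by row, up to a small numerical tolerance."""
    return all(sum(m * v for m, v in zip(row, z)) >= -tol for row in M)


def lift(x):
    """Map a point x of the original program to (x, 1) in the cone x_0 > 0."""
    return list(x) + [1.0]


def project(z):
    """Map a homogeneous point (x, x_0) with x_0 > 0 back to x / x_0."""
    *x, x0 = z
    assert x0 > 0, "the cone C requires x_0 > 0"
    return [v / x0 for v in x]


A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 2.0]
M = homogenize(A, b)
print(satisfies(M, lift([0.5, 1.5])))   # a feasible x lifts to a feasible (x, 1)
print(project([1.0, 3.0, 2.0]))         # a homogeneous solution scales back to x
```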

2.1 Linear Programs in Conic Form and Basic Convex Probability Theory 

The feasibility problem for a conic linear program can be written:

$$\text{find } x \text{ such that } Ax \ge 0, \ x \in C,$$

where $C$ is a non-pointed convex cone in $\mathbb{R}^d$ and $A$ is an n-by-d matrix. Note that because $C$
is a non-pointed convex cone, $0$ cannot be a feasible solution of this program. The following
definition generalizes distance to ill-posedness by explicitly taking into account the non-pointed
convex cone, $C$.

Definition 2.1.1 (Generalized distance to ill-posedness). For a non-pointed convex cone,
$C$, that is not subject to perturbation, and a matrix, $A$, we define $\rho(A, C)$ by

a. if $Ax \ge 0$, $x \in C$ is feasible, then

$$\rho(A, C) = \sup\{\epsilon : \|\Delta A\|_F \le \epsilon \text{ implies } (A + \Delta A)x \ge 0, \ x \in C \text{ is feasible}\};$$

b. if $Ax \ge 0$, $x \in C$ is infeasible, then

$$\rho(A, C) = \sup\{\epsilon : \|\Delta A\|_F \le \epsilon \text{ implies } (A + \Delta A)x \ge 0, \ x \in C \text{ is infeasible}\}.$$

We note that this definition makes sense even when $A$ is a row vector. In this case, $\rho(a, C)$
measures the distance to ill-posedness when we only allow perturbation to $a$. Even though
transformations among linear programming formulations in general do not preserve condition number,
Peña [Peñ00] has proved that homogenization does not alter the distance to ill-posedness. For
convenience, we will state the lemma for form (1), and note that similar statements hold for
$C_P^{(2)}$, $C_D^{(2)}$, and $C_D^{(3)}$.

Lemma 2.1.2 (Preserving the condition number). Let

$$\max\ c^T x \ \text{ s.t. } Ax \le b$$

be a linear program. Let $C = \{(x, x_0) : x_0 > 0\}$. Then, $C_P^{(1)}(A, b) = \|A, b\|_F / \rho([-A, b], C)$.

The primal program of form (4) is not quite in conic form; to handle it, we need the following
definition.

Definition 2.1.3 (Pointed generalized primal distance to ill-posedness). For a convex
cone that is not non-pointed, $C$, and a matrix, $A$, we define $\rho(A, C)$ by

a. if $Ax \ge 0$, $x \ne 0$, $x \in C$ is feasible, then

$$\rho(A, C) = \sup\{\epsilon : \|\Delta A\|_F \le \epsilon \text{ implies } (A + \Delta A)x \ge 0, \ x \ne 0, \ x \in C \text{ is feasible}\};$$

b. if $Ax \ge 0$, $x \ne 0$, $x \in C$ is infeasible, then

$$\rho(A, C) = \sup\{\epsilon : \|\Delta A\|_F \le \epsilon \text{ implies } (A + \Delta A)x \ge 0, \ x \ne 0, \ x \in C \text{ is infeasible}\}.$$

This definition would allow us to prove the analogs of Lemmas 2.1.4 and 2.1.5 for primal
programs of form (4). We omit the details of this variation on the arguments in the interest of
simplicity.



The following two lemmas are the main result of this section. To see how they may be applied,
we note that a simple union bound over $C_P^{(2)}$ and $C_D^{(2)}$ using Lemma 2.1.4 yields Theorem 1.3.1
for form (2).

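Lemma 2.1.4 (Condition number is likely polynomial). For any non-pointed convex cone
$C$ and a matrix $\bar{A}$ satisfying $\|\bar{A}\|_F \le 1$, for $\sigma \le 1/\sqrt{nd}$,

$$\Pr_{A \leftarrow \mathcal{N}(\bar{A},\sigma)} \left[ \frac{\|A\|_F}{\rho(A, C)} \ge \frac{2^{12} n^2 d^{1.5}}{\delta\sigma^2} \log^2 \left( \frac{2^{9} n^2 d^{1.5}}{\delta\sigma^2} \right) \right] \le \delta.$$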



Lemma 2.1.5 (Smoothed analysis of log of primal condition number). For any
non-pointed convex cone $C$ and a matrix $\bar{A}$ satisfying $\|\bar{A}\|_F \le 1$, for $\sigma \le 1/\sqrt{nd}$,

$$\mathbf{E}_{A \leftarrow \mathcal{N}(\bar{A},\sigma)} \left[ \log \frac{\|A\|_F}{\rho(A, C)} \right] \le 14 + 4.5 \log \frac{nd}{\sigma}.$$

We will prove Lemma 2.1.4 by separately considering the cases in which the program is feasible
and infeasible. In Section 2.2, we show that it is unlikely that a program is feasible and yet can
be made infeasible by a small change to its constraints (Lemma 2.2.1). In Section 2.3, we show
that it is unlikely that a program is infeasible and yet can be made feasible by a small change to
its constraints (Lemma 2.3.1). In Section 2.4, we combine these results to show that the primal
condition number is polynomial with high probability (Lemma 2.1.4). In Section 2.5, we prove
Lemma 2.1.5.

The thread of argument in these sections consists of a geometric characterization of those
programs with poor condition number, followed by a probabilistic argument demonstrating that
this characterization is rarely satisfied. Throughout the proofs in this section, $C$ will always
refer to the original non-pointed cone, and a subscripted $C$ (e.g., $C_0$) will refer to a modification
of this cone.

The key probabilistic tool used in the analysis is Lemma 2.1.7, which we will derive from the
following result of [Bal93]. A slightly weaker version of this lemma was proved in [BD02], and
also in [BR76].

Theorem 2.1.6 (Ball [Bal93]). Let $K$ be a convex body in $\mathbb{R}^d$ and let $\mu$ be the density function
of a $\mathcal{N}(0, \sigma)$ Gaussian random variable. Then,

$$\int_{\partial K} \mu \le \frac{4 d^{1/4}}{\sigma}.$$

Lemma 2.1.7 ($\epsilon$-Boundaries are likely to be missed). Let $K$ be an arbitrary convex body
in $\mathbb{R}^d$, and let $\mathrm{bdry}(K, \epsilon)$ denote the $\epsilon$-boundary of $K$; that is,

$$\mathrm{bdry}(K, \epsilon) = \{x : \exists x' \in \partial K, \ \|x - x'\| \le \epsilon\}.$$

For any $\bar{x} \in \mathbb{R}^d$,

$$\Pr_{x \leftarrow \mathcal{N}(\bar{x},\sigma)} [x \in \mathrm{bdry}(K, \epsilon) \setminus K] \le \frac{4\epsilon d^{1/4}}{\sigma}, \quad \text{(outside boundary)}$$

$$\Pr_{x \leftarrow \mathcal{N}(\bar{x},\sigma)} [x \in \mathrm{bdry}(K, \epsilon) \cap K] \le \frac{4\epsilon d^{1/4}}{\sigma}. \quad \text{(inside boundary)}$$

Proof. We derive the result assuming $\sigma = 1$. The result for general $\sigma$ follows by scaling.

Let $\mu$ denote the density according to which $x$ is distributed. To derive the first inequality, we
let $K_\epsilon$ denote the points of distance at most $\epsilon$ from $K$, and observe that $K_\epsilon$ is convex.

Integrating by shells, we obtain

$$\Pr [x \in \mathrm{bdry}(K, \epsilon) \setminus K] \le \int_{t=0}^{\epsilon} \int_{\partial K_t} \mu \le 4\epsilon d^{1/4},$$

by Theorem 2.1.6.

We similarly derive the second inequality by defining $K_\epsilon$ to be the set of points inside $K$ of
distance at least $\epsilon$ from the boundary of $K$ and observing that $K_\epsilon$ is convex for any $\epsilon$. □

In this section and the next, we use the following consequence of Lemma 2.1.7 repeatedly.

Lemma 2.1.8 (Feasible likely quite feasible, single constraint). Let $C_0$ be any convex
cone in $\mathbb{R}^d$ and, for any $\bar{a} \in \mathbb{R}^d$, let $a$ be a Gaussian perturbation of $\bar{a}$ of variance $\sigma^2$. Then,

$$\Pr_{a} [C_0 \cap \mathcal{H}(a) \text{ is feasible and } \rho(a, C_0) \le \epsilon] \le \frac{4\epsilon d^{1/4}}{\sigma}, \quad \text{and}$$

$$\Pr_{a} [C_0 \cap \mathcal{H}(a) \text{ is infeasible and } \rho(a, C_0) \le \epsilon] \le \frac{4\epsilon d^{1/4}}{\sigma}.$$

Proof. Let $K$ be the set of $a$ for which $C_0 \cap \mathcal{H}(a)$ is infeasible. Observe that $\rho(a, C_0)$ is exactly
the distance from $a$ to the boundary of $K$. Since $K$ is a convex cone, the first inequality follows
from the first inequality (the outside boundary inequality) of Lemma 2.1.7, which tells us that
the probability that $a$ has distance at most $\epsilon$ to the boundary of $K$ and is outside $K$ is at most
$4\epsilon d^{1/4}/\sigma$. The second inequality similarly follows from the second inequality (the inside boundary
inequality) of Lemma 2.1.7. □
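A simple special case makes the first inequality easy to check numerically. Take $C_0 = \mathrm{Ray}(p)$ for a unit vector $p$; by Lemma 2.2.5 below, $\rho(a, \mathrm{Ray}(p)) = |a^T p|$, so the event in question is $0 \le a^T p \le \epsilon$. The following Monte Carlo sketch (our own illustration; the function names and parameter choices are assumptions) estimates that probability and compares it with the bound $4\epsilon d^{1/4}/\sigma$.

```python
import random


def estimate_barely_feasible(a_bar, p, sigma, eps, trials=100_000, seed=1):
    """Estimate Pr[Ray(p) ∩ H(a) is feasible and rho(a, Ray(p)) <= eps]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(m, sigma) for m in a_bar]
        ip = sum(x * y for x, y in zip(a, p))
        # Ray(p) ∩ H(a) is feasible iff a^T p >= 0; in that case rho equals a^T p.
        if 0.0 <= ip <= eps:
            hits += 1
    return hits / trials


def lemma_bound(d, sigma, eps):
    """Right-hand side of Lemma 2.1.8: 4 * eps * d^(1/4) / sigma."""
    return 4 * eps * d ** 0.25 / sigma


est = estimate_barely_feasible([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], sigma=1.0, eps=0.1)
print(est, lemma_bound(3, 1.0, 0.1))
```

In this one-dimensional special case the exact probability is that of a Gaussian landing in an interval of length $\epsilon$, so the estimate sits well below the general bound.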

2.2 Primal condition number, feasible case 

In this subsection, we analyze the primal condition number in the feasible case and prove:

Lemma 2.2.1 (Feasible is likely quite feasible, all constraints). Let $C$ be a non-pointed
convex cone in $\mathbb{R}^d$ and let $\bar{A}$ be any n-by-d matrix. Then for any $\sigma > 0$,

$$\Pr_{A \leftarrow \mathcal{N}(\bar{A},\sigma)} [(Ax \ge 0, \ x \in C \text{ is feasible}) \text{ and } (\rho(A, C) \le \epsilon)] \le \frac{4\epsilon n d^{5/4}}{\sigma}.$$

To prove Lemma 2.2.1, we first establish a necessary geometric condition for $\rho$ to be small.
This condition is stated and proved in Lemma 2.2.2. In Lemma 2.2.6, we apply Helly's Theorem
[LDK63] to simplify this geometric condition, expressing it in terms of the minimum of $\rho$
over individual constraints. This allows us to use Lemma 2.1.8 to establish Lemma 2.2.9, which
shows that this geometric condition is unlikely to be met. Lemma 2.2.1 is then a corollary of
Lemmas 2.2.9 and 2.2.2.
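Read quantitatively, the bound of Lemma 2.2.1 says that a feasible perturbed program has $\rho(A, C) \le \epsilon$ with probability at most $\delta$ once $\epsilon \le \delta\sigma/(4nd^{5/4})$. The small sketch below (function names are ours) evaluates the bound and inverts it.

```python
def feasible_case_bound(eps, n, d, sigma):
    """Right-hand side of Lemma 2.2.1: 4 * eps * n * d^(5/4) / sigma."""
    return 4 * eps * n * d ** 1.25 / sigma


def eps_for_failure_probability(delta, n, d, sigma):
    """Largest eps for which the bound of Lemma 2.2.1 is at most delta."""
    return delta * sigma / (4 * n * d ** 1.25)


n, d, sigma = 10, 16, 0.5
eps = eps_for_failure_probability(0.01, n, d, sigma)
print(eps, feasible_case_bound(eps, n, d, sigma))
```

The threshold shrinks only polynomially in $n$, $d$ and $1/\sigma$, which is the shape of dependence the section is after.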

We remark that a result similar to Lemma 2.2.2 appears in [CC01].

Lemma 2.2.2 (Bounding $\rho$ by a max of min of inner products). Let $C$ be a non-pointed
convex cone and let $a_1, \ldots, a_n$ be vectors in $\mathbb{R}^d$ for which $C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i)$ is feasible. Then

$$\rho([a_1, \ldots, a_n], C) \ge \max_{\substack{p \in C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i) \\ \|p\| = 1}} \ \min_i \ a_i^T p.$$

Proof. Lemma 2.2.2 follows directly from Lemmas 2.2.3, 2.2.4 and 2.2.5 below. These three
lemmas develop a characterization of $\rho$, the distance to ill-posedness, in the feasible case. □

Lemma 2.2.3 (Lower bounding $\rho$ by rays). Under the conditions of Lemma 2.2.2,

$$\rho([a_1, \ldots, a_n], C) \ge \max_{p \in C \cap \bigcap_i \mathcal{H}(a_i)} \rho([a_1, \ldots, a_n], \mathrm{Ray}(p)).$$

Proof. Let $\Delta a_1, \ldots, \Delta a_n$ be such that $C \cap \bigcap_i \mathcal{H}(\Delta a_i + a_i)$ is infeasible. Then, for all
$p \in C \cap \bigcap_i \mathcal{H}(a_i)$, $\mathrm{Ray}(p) \cap \bigcap_i \mathcal{H}(\Delta a_i + a_i)$ is also infeasible. □

Lemma 2.2.4 ($\rho$ of a ray as a min over constraints). For every set of vectors $a_1, \ldots, a_n$
and $p$ such that $\mathrm{Ray}(p) \cap \bigcap_i \mathcal{H}(a_i)$ is feasible,

$$\rho([a_1, \ldots, a_n], \mathrm{Ray}(p)) = \min_i \rho(a_i, \mathrm{Ray}(p)).$$

Proof. Observe that $\mathrm{Ray}(p) \cap \bigcap_i \mathcal{H}(a_i + \Delta a_i)$ is feasible if and only if $\mathrm{Ray}(p) \cap \mathcal{H}(a_i + \Delta a_i)$
is feasible for all $i$. □

Lemma 2.2.5 ($\rho$ of a ray and single constraint as an inner product). For every vector
$a$ and every unit vector $p$,

$$\rho(a, \mathrm{Ray}(p)) = |a^T p|.$$

Proof. If $a^T p = 0$, then $\rho(a, \mathrm{Ray}(p)) = 0$. If $a^T p \ne 0$, then $\mathrm{Ray}(p) \cap \mathcal{H}(a)$ is feasible if and
only if $\mathrm{Ray}(-p) \cap \mathcal{H}(a)$ is infeasible; so, it suffices to consider the case where $\mathrm{Ray}(p) \cap \mathcal{H}(a)$
is feasible. So, we assume $a^T p > 0$, in which case $\mathrm{Ray}(p) \cap \mathcal{H}(a)$ is feasible. We first prove
that $\rho(a, \mathrm{Ray}(p)) \ge a^T p$. For every vector $\Delta a$ of norm at most $a^T p$, we have

$$(a + \Delta a)^T p = a^T p + \Delta a^T p \ge a^T p - \|\Delta a\| \ge 0.$$

That is, $p \in \mathcal{H}(a + \Delta a)$. As this holds for every $\Delta a$ of norm at most $a^T p$, we have
$\rho(a, \mathrm{Ray}(p)) \ge a^T p$.

To show that $\rho(a, \mathrm{Ray}(p)) \le a^T p$, note that setting $\Delta a = -(\epsilon + a^T p)p$, for any $\epsilon > 0$, yields

$$(a + \Delta a)^T p = a^T p + \Delta a^T p = a^T p - (\epsilon + a^T p) p^T p = a^T p - (\epsilon + a^T p) = -\epsilon;$$

so, $\mathrm{Ray}(p) \cap \mathcal{H}(a + \Delta a)$ is infeasible. As this holds for every $\epsilon > 0$, $\rho(a, \mathrm{Ray}(p)) \le a^T p$. □
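Both halves of this proof can be checked numerically on a small example. The sketch below is our own illustration (the names `rho_ray` and `breaking_perturbation` are hypothetical): it evaluates $\min_i a_i^T p$, which by Lemmas 2.2.4 and 2.2.5 is the distance to ill-posedness against a ray in the feasible case, and builds the perturbation $-(\epsilon + a^T p)p$ from the upper-bound argument.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def rho_ray(rows, p):
    """min_i a_i^T p for a unit vector p, assuming every a_i^T p >= 0 (feasible case)."""
    return min(dot(a, p) for a in rows)


def breaking_perturbation(a, p, eps):
    """The perturbation -(eps + a^T p) p used in the proof of Lemma 2.2.5."""
    s = eps + dot(a, p)
    return [-s * x for x in p]


a, p = [3.0, 4.0], [0.6, 0.8]
da = breaking_perturbation(a, p, 0.5)
moved = [x + y for x, y in zip(a, da)]
print(dot(a, p))       # a^T p, approximately 5
print(dot(moved, p))   # (a + da)^T p, approximately -0.5, so Ray(p) leaves H(a + da)
print(rho_ray([[3.0, 4.0], [1.0, 0.0]], p))  # the tighter constraint decides
```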

Lemma 2.2.6 (Bounding the max of min of inner products). Let $C$ be a non-pointed
convex cone and let $a_1, \ldots, a_n$ be vectors in $\mathbb{R}^d$ for which $C \cap \bigcap_i \mathcal{H}(a_i)$ is feasible. Then

$$\max_{\substack{p \in C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i) \\ \|p\| = 1}} \ \min_i \ a_i^T p \ \ge \ \min_i \ \rho\Big(a_i, \ C \cap \bigcap_{j \ne i} \mathcal{H}(a_j)\Big) \Big/ d.$$

We will derive Lemma 2.2.6 from Lemmas 2.2.7 and 2.2.8, which we now state and prove.

Lemma 2.2.7 (Quite feasible region implies quite feasible point, single constraint).
For every $a$ and every non-pointed convex cone $C_0$ for which $C_0 \cap \mathcal{H}(a)$ is feasible,

$$\rho(a, C_0) = \max_{\substack{p \in C_0 \cap \mathcal{H}(a) \\ \|p\| = 1}} a^T p.$$

Proof. The "$\ge$" direction follows from Lemmas 2.2.3 and 2.2.5; so, we concentrate on showing

$$\rho(a, C_0) \le \max_{\substack{p \in C_0 \cap \mathcal{H}(a) \\ \|p\| = 1}} a^T p.$$

We recall that, as $C_0$ is non-pointed, there exists a vector $t$ such that $t^T x < 0$ for all $x \in C_0$.
We now divide the proof into two cases depending on whether $a \in C_0$.

If $a \in C_0$, then we let $p = a / \|a\|$. It is easy to verify that

$$a^T p = \|a\| = \max_{\|p\| = 1} a^T p = \max_{\substack{p \in C_0 \cap \mathcal{H}(a) \\ \|p\| = 1}} a^T p.$$

Moreover, $C_0 \cap \mathcal{H}(a - (a - \epsilon t))$ is infeasible for every $\epsilon > 0$, since $\mathcal{H}(\epsilon t)$ contains no
point $x$ with $t^T x < 0$. So, $\rho(a, C_0) \le \|a\|$.

If $a \notin C_0$, let $q$ be the point of $C_0$ that is closest to $a$. As $C_0 \cap \mathcal{H}(a)$ is feasible, $q$ lies inside
$\mathcal{H}(a)$ and is not the origin. Let $p = q / \|q\|$. As $C_0$ is a cone, $q$ is perpendicular to $a - q$. Thus,
the distance from $a$ to $q$ is $\sqrt{\|a\|^2 - \|q\|^2} = \sqrt{\|a\|^2 - (a^T p)^2}$, as $a^T p = \|q\|$. Conversely, for
any unit vector $r \in C_0$, the distance from $\mathrm{Ray}(r)$ to $a$ is $\sqrt{\|a\|^2 - (a^T r)^2}$. Thus, the unit
vector $r \in C_0$ maximizing $a^T r$ must be $p$.

As $C_0$ is convex, there is a plane through $q$ separating $C_0$ from $a$ and perpendicular to the line
segment $a - q$, and thus $\rho(a, C_0) \le \|q\| = a^T p$.

□
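The projection step in the second case can be illustrated on a concrete cone. The sketch below is our own example, not part of the paper: it uses the nonnegative orthant as a stand-in for $C_0$ (strictly speaking the closure of the open positive orthant, an assumption made so that the closest point exists), projects $a$ onto it, and checks that the normalized projection maximizes $a^T p$ with value $\|q\|$.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def best_unit_vector_in_orthant(a):
    """Project a onto the nonnegative orthant and normalize.

    Assumes a has at least one positive coordinate, so the projection q is nonzero.
    Returns (p, value) with p = q / ||q|| and value = a^T p = ||q||.
    """
    q = [max(x, 0.0) for x in a]
    norm_q = math.sqrt(dot(q, q))
    p = [x / norm_q for x in q]
    return p, dot(a, p)


a = [3.0, -1.0, 4.0]
p, value = best_unit_vector_in_orthant(a)
q = [value * x for x in p]
residual = [x - y for x, y in zip(a, q)]
print(value)             # ||q||, here close to 5
print(dot(q, residual))  # close to 0: q is perpendicular to a - q
```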

Lemma 2.2.8 (Quite feasible individually implies quite feasible collectively). Let Cq 

be a non-pointed convex cone and let Oi,...,o n be vectors in TR d . If there exist unit vectors 
Pi; Co, such that 

ajpi > e, for all i, and 
a JPj — 0) f or a ^ * an d 3 j 

then there exists a unit vector p G Co such that 

ajp > e/d, for all i. 

Proof. We prove this using Helly's Theorem [DGK63], which says that if a collection of convex sets in $\mathbb{R}^d$ has the property that every subcollection of $d + 1$ of the sets has a common point, then the entire collection has a common point. Let

\[ S_i = \{x \in C_0 : a_i^T x/\|x\| \geq \epsilon/d\}. \]

We begin by proving that every $d$ of the $S_i$'s contain a point in common. Without loss of generality, we consider $S_1, \ldots, S_d$. Let $p = \sum_{i=1}^{d} p_i/d$. Then, for each $1 \leq j \leq d$,

\[ a_j^T p = \sum_{i=1}^{d} a_j^T p_i/d \geq a_j^T p_j/d \geq \epsilon/d. \]

As $p$ has norm at most one, $a_j^T p/\|p\| \geq a_j^T p$, so $p$ is contained in each of $S_1, \ldots, S_d$.

As $C_0$ is non-pointed, there exists $t$ such that $t^T x < 0$ for all $x \in C_0$. Let $S'_i = S_i \cap \{x : t^T x = -1\}$. Then, $x \in S_i$ implies $-x/t^T x \in S'_i$. So, every $d$ of the $S'_i$ have a point in common. As these are convex sets lying in a $(d-1)$-dimensional space, Helly's Theorem tells us that there exists a point $p$ that lies within all of the $S'_i$'s. As $S'_i \subseteq S_i$, this point lies inside all the $S_i$'s. □
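The averaging step in the proof above can be checked numerically. The following is a minimal sketch, not part of the original argument: the test vectors are hypothetical, drawn entrywise positive so that the hypothesis $a_i^T p_j \geq 0$ holds automatically, and $\epsilon$ is taken to be $\min_i a_i^T p_i$.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def averaging_bound_holds(d, rng):
    """Check a_j^T p / ||p|| >= eps / d for p = (p_1 + ... + p_d) / d."""
    # Hypothetical test data: entrywise positive vectors, so a_i^T p_j >= 0 for all i, j.
    a = [[rng.uniform(0.1, 1.0) for _ in range(d)] for _ in range(d)]
    p = [unit([rng.uniform(0.1, 1.0) for _ in range(d)]) for _ in range(d)]
    eps = min(dot(a[i], p[i]) for i in range(d))
    avg = [sum(p_i[k] for p_i in p) / d for k in range(d)]
    norm = math.sqrt(dot(avg, avg))
    slack = 1e-12  # tolerance for floating-point rounding
    # avg is a convex combination of unit vectors, so its norm is at most one.
    return norm <= 1 + slack and all(
        dot(a[j], avg) / norm >= eps / d - slack for j in range(d)
    )
```

Calling `averaging_bound_holds(d, random.Random(0))` for several dimensions `d` exercises only the first half of the proof (the common point of $d$ sets); the Helly step is not simulated.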



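Proof of Lemma 2.2.6. For each $i$, we apply Lemma 2.2.7 to the vector $a_i$ and the cone $C \cap \bigcap_{j \neq i} \mathcal{H}(a_j)$ to find a unit vector $p_i \in C \cap \bigcap_{j=1}^{n} \mathcal{H}(a_j)$ such that

\[ p_i^T a_i = \rho\Big(a_i,\ C \cap \bigcap_{j \neq i} \mathcal{H}(a_j)\Big). \]

As $p_i \in C \cap \bigcap_j \mathcal{H}(a_j)$, we also have

\[ p_i^T a_j \geq 0 \]

for all $j$. Applying Lemma 2.2.8, we find a unit vector $p \in C \cap \bigcap_{j=1}^{n} \mathcal{H}(a_j)$ satisfying

\[ a_i^T p \geq \min_i \rho\Big(a_i,\ C \cap \bigcap_{j \neq i} \mathcal{H}(a_j)\Big) \Big/ d, \]

for all $i$. □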



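Lemma 2.2.9 (Max of min of inner products is likely large). Let $C$ be a non-pointed convex cone in $\mathbb{R}^d$ and let $\bar{a}_1, \ldots, \bar{a}_n$ be vectors in $\mathbb{R}^d$. Let $a_1, \ldots, a_n$ be Gaussian perturbations of $\bar{a}_1, \ldots, \bar{a}_n$ of variance $\sigma^2$. Then,

\[ \Pr\Big[ C \cap \bigcap_i \mathcal{H}(a_i) \text{ is feasible and } \max_{p \in C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i),\ \|p\| = 1}\ \min_i a_i^T p < \epsilon \Big] \leq \frac{4 \epsilon n d^{5/4}}{\sigma}. \]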



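Proof. By Lemma 2.2.6,

\begin{align*}
&\Pr\Big[ C \cap \bigcap_i \mathcal{H}(a_i) \text{ is feasible and } \max_{p \in C \cap \bigcap_{i=1}^{n} \mathcal{H}(a_i),\ \|p\| = 1}\ \min_i a_i^T p < \epsilon \Big] \\
&\quad \leq \Pr\Big[ C \cap \bigcap_i \mathcal{H}(a_i) \text{ is feasible and } \min_i \rho\Big(a_i,\ C \cap \bigcap_{j \neq i} \mathcal{H}(a_j)\Big) < d\epsilon \Big].
\end{align*}

Applying a union bound over $i$ and then Lemma 2.1.8, we find this probability is at most

\[ \sum_{i=1}^{n} \Pr\Big[ C \cap \bigcap_j \mathcal{H}(a_j) \text{ is feasible and } \rho\Big(a_i,\ C \cap \bigcap_{j \neq i} \mathcal{H}(a_j)\Big) < d\epsilon \Big] \leq \sum_{i=1}^{n} \frac{4 (\epsilon d) d^{1/4}}{\sigma} = \frac{4 n \epsilon d^{5/4}}{\sigma}. \]

□

Proof of Lemma 2.2.1. Follows immediately from Lemmas 2.2.2 and 2.2.9. □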



This concludes the analysis that it is unlikely that the primal program is both feasible and has 
small distance to ill-posedness. Next, we show that it is unlikely that the primal program is 
both infeasible and has small distance to ill-posedness. 



2.3 Primal number, infeasible case 

The main result of this subsection is: 

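Lemma 2.3.1 (Infeasible is likely quite infeasible). Let $C$ be a non-pointed convex cone in $\mathbb{R}^d$ and let $\bar{A}$ be any $n$-by-$d$ matrix such that $\|\bar{A}\|_F \leq 1$. Then, for any $0 < \sigma \leq 1/\sqrt{d}$ and $\epsilon \leq 1/2$,

\[ \Pr_{A \leftarrow \mathcal{N}(\bar{A}, \sigma)}\big[ (Ax \geq 0,\ x \in C \text{ is infeasible}) \text{ and } (\rho(A, C) \leq \epsilon) \big] \leq \frac{361\, \epsilon\, n^2 d^{1.5} \log^{1.5}(1/\epsilon)}{\sigma^2}. \]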



To prove Lemma 2.3.1, we consider adding the constraints one at a time. If the program is infeasible in the end, then there must be some constraint, which we call the critical constraint, that takes it from being feasible to being infeasible. Lemma 2.3.2 gives a sufficient geometric condition for the program to be quite infeasible when the critical constraint is added. We then prove Lemma 2.3.1 by showing that this condition is met with good probability. The geometric condition is that the program is quite feasible before the critical constraint is added and that every previously feasible point is far from being feasible for the critical constraint.

Lemma 2.3.2 (The feasible-to-infeasible transition). Let $C$ be a non-pointed convex cone in $\mathbb{R}^d$, $p \in C$ be a unit vector, and $a_1, \ldots, a_{k+1}$ be vectors in $\mathbb{R}^d$ such that

\[ a_i^T p \geq \alpha, \text{ for } 1 \leq i \leq k, \quad\text{and}\quad a_{k+1}^T x \leq -\beta \text{ for all } x \in C \cap \bigcap_{i=1}^{k} \mathcal{H}(a_i),\ \|x\| = 1. \]

Then,

\[ \rho([a_1, \ldots, a_{k+1}], C) \geq \min\left\{ \frac{\alpha}{2},\ \frac{\alpha\beta}{4\alpha + 2\|a_{k+1}\|} \right\}. \]

We will derive Lemma 2.3.2 from the following geometric lemma.

Lemma 2.3.3 ($\rho$ bound on inner product). Let $C$ be a non-pointed convex cone and let $a$ be a vector for which $C \cap \mathcal{H}(a)$ is infeasible. Then,

\[ \max_{p \in C,\ \|p\| = 1} p^T a \leq -\rho(a, C). \]

Proof. Let $p$ be the unit vector in $C$ maximizing $p^T a$. If we set

\[ \Delta a = (\epsilon - p^T a)\, p, \]

for any $\epsilon > 0$, then we can see that $C \cap \mathcal{H}(a + \Delta a)$ is feasible from

\begin{align*}
p^T (a + \Delta a) &= p^T a + (\epsilon - p^T a)\, p^T p \\
&= p^T a + (\epsilon - p^T a) \\
&= \epsilon.
\end{align*}

So, we may conclude $\rho(a, C) \leq |p^T a|$. □

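Proof of Lemma 2.3.2. The conditions of the lemma imply that $C \cap \bigcap_{i=1}^{k+1} \mathcal{H}(a_i)$ is infeasible. So, we may prove the lemma by demonstrating that for all $\epsilon$ satisfying

\begin{align}
\epsilon &< \alpha/2, \text{ and} \tag{5} \\
\epsilon &< \frac{\beta}{4 + 2\|a_{k+1}\|/\alpha}, \tag{6}
\end{align}

and all $\{\Delta a_1, \ldots, \Delta a_{k+1}\}$ satisfying $\|\Delta a_i\| \leq \epsilon$ for $1 \leq i \leq k+1$, we have

\[ C \cap \bigcap_{i=1}^{k+1} \mathcal{H}(a_i + \Delta a_i) \text{ is infeasible.} \]

Assume by way of contradiction that

\[ C \cap \bigcap_{i=1}^{k+1} \mathcal{H}(a_i + \Delta a_i) \text{ is feasible.} \]

Then, there exists a unit vector $x' \in C \cap \bigcap_{i=1}^{k+1} \mathcal{H}(a_i + \Delta a_i)$. We first show that

\[ x' + \frac{\epsilon}{\alpha}\, p \in C \cap \bigcap_{i=1}^{k} \mathcal{H}(a_i). \tag{7} \]

To see this, consider any $i \leq k$ and note that $(a_i + \Delta a_i)^T x' \geq 0$ implies

\[ a_i^T x' \geq -\Delta a_i^T x' \geq -\|\Delta a_i\|\, \|x'\| \geq -\epsilon. \]

Thus,

\[ a_i^T \Big( x' + \frac{\epsilon}{\alpha}\, p \Big) = a_i^T x' + \frac{\epsilon}{\alpha}\, a_i^T p \geq -\epsilon + \frac{\epsilon}{\alpha}\, \alpha \geq 0. \]

To finish our proof of (7), we observe that $x' \in C$ and $p \in C$ imply $x' + \frac{\epsilon}{\alpha} p \in C$.

Let $x = x' + \frac{\epsilon}{\alpha} p$. Then $x \in C \cap \bigcap_{i=1}^{k} \mathcal{H}(a_i)$ and $x$ has norm at most $1 + \epsilon/\alpha$ and at least $1 - \epsilon/\alpha$. To derive a contradiction, we now compute

\begin{align*}
(a_{k+1} + \Delta a_{k+1})^T x' &= (a_{k+1} + \Delta a_{k+1})^T (x - (\epsilon/\alpha) p) \\
&= a_{k+1}^T x + \Delta a_{k+1}^T x - (\epsilon/\alpha)\, a_{k+1}^T p - (\epsilon/\alpha)\, \Delta a_{k+1}^T p \\
&\leq -\beta \|x\| + \|\Delta a_{k+1}\|\, \|x\| + (\epsilon/\alpha) \|a_{k+1}\| + (\epsilon/\alpha) \|\Delta a_{k+1}\| \\
&\leq -\beta (1 - \epsilon/\alpha) + \epsilon (1 + \epsilon/\alpha) + (\epsilon/\alpha) \|a_{k+1}\| + \epsilon^2/\alpha \\
&= -\beta (1 - \epsilon/\alpha) + \epsilon \big( (1 + \epsilon/\alpha) + \|a_{k+1}\|/\alpha + \epsilon/\alpha \big) \\
&< -\beta/2 + \epsilon (2 + \|a_{k+1}\|/\alpha), \text{ by (5)} \\
&< 0, \text{ by (6),}
\end{align*}

which contradicts $x' \in C \cap \bigcap_{i=1}^{k+1} \mathcal{H}(a_i + \Delta a_i)$. □

We now prove that the geometric condition of Lemma 2.3.2 holds with high probability. First, we establish two basic statements.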

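Proposition 2.3.4. For positive $\alpha$, $\beta$ and any vector $a_{k+1}$,

\[ \frac{\alpha\beta}{2\alpha + \|a_{k+1}\|} \geq \min\left\{ \frac{\alpha\beta}{2 + \|a_{k+1}\|},\ \frac{\beta}{2 + \|a_{k+1}\|} \right\}. \]

Proof. For $\alpha \geq 1$, we have

\[ \frac{\alpha\beta}{2\alpha + \|a_{k+1}\|} = \frac{\beta}{2 + \|a_{k+1}\|/\alpha} \geq \frac{\beta}{2 + \|a_{k+1}\|}, \]

while for $\alpha \leq 1$ we have

\[ \frac{\alpha\beta}{2\alpha + \|a_{k+1}\|} \geq \frac{\alpha\beta}{2 + \|a_{k+1}\|}. \]

□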
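The two-case argument for Proposition 2.3.4 can be spot-checked numerically. This is only an illustrative sketch: the scalar `s` stands in for $\|a_{k+1}\|$, and the sampling ranges are arbitrary choices, not taken from the paper.

```python
import random


def left_side(alpha, beta, s):
    """alpha*beta / (2*alpha + s), where s stands for ||a_{k+1}||."""
    return alpha * beta / (2 * alpha + s)


def right_side(alpha, beta, s):
    """min{ alpha*beta / (2 + s), beta / (2 + s) }."""
    return min(alpha * beta / (2 + s), beta / (2 + s))


def proposition_holds(alpha, beta, s, rel_tol=1e-12):
    """True when the inequality holds up to floating-point rounding."""
    return left_side(alpha, beta, s) >= right_side(alpha, beta, s) * (1 - rel_tol)


def random_trials(count, seed=0):
    """Yield (alpha, beta, s) triples; the ranges are illustrative assumptions."""
    rng = random.Random(seed)
    for _ in range(count):
        # Log-uniform samples so both regimes alpha >= 1 and alpha <= 1 are covered.
        alpha = 10 ** rng.uniform(-3, 3)
        beta = 10 ** rng.uniform(-3, 3)
        s = 10 ** rng.uniform(-3, 3)
        yield alpha, beta, s
```

The boundary case $\alpha = 1$ gives equality on both branches, which is why the comparison allows a small relative tolerance.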



Proposition 2.3.5. If $C \cap \bigcap_{i=1}^{k} \mathcal{H}(a_i)$ is infeasible, then

\[ \rho([a_1, \ldots, a_k], C) \leq \rho([a_1, \ldots, a_n], C). \]

Proof. Adding constraints cannot make it easier to change the program to make it feasible. □

Proof of Lemma 2.3.1. Let $a_1, \ldots, a_n$ be the rows of $A$, and let

\[ C_0 = C \quad\text{and}\quad C_k = C \cap \bigcap_{i=1}^{k} \mathcal{H}(a_i). \]

Note that $C_n$ is the final program. Let $E_k$ denote the event that $C_{k-1}$ is feasible and $C_k$ is infeasible. Using Proposition 2.3.5 and the fact that $C_n$ infeasible implies that $E_k$ must hold for some $k$, we obtain

\begin{align}
\Pr[C_n \text{ is infeasible and } \rho([a_1, \ldots, a_n], C) \leq \epsilon]
&\leq \sum_{k=0}^{n-1} \Pr[E_{k+1} \text{ and } \rho([a_1, \ldots, a_n], C) \leq \epsilon] \notag \\
&\leq \sum_{k=0}^{n-1} \Pr[E_{k+1} \text{ and } \rho([a_1, \ldots, a_{k+1}], C) \leq \epsilon]. \tag{8}
\end{align}

If $E_{k+1}$ occurs, then $C_k$ is feasible, and we may define

\[ \kappa(a_1, \ldots, a_k) = \max_{p \in C_k,\ \|p\| = 1}\ \min_{1 \leq i \leq k} a_i^T p. \]

Then, $E_{k+1}$ implies that the unit vector $p \in C_k$ attaining this maximum satisfies

\[ a_i^T p \geq \kappa(a_1, \ldots, a_k), \text{ for } 1 \leq i \leq k, \]

and Lemma 2.3.3 implies

\[ a_{k+1}^T x \leq -\rho(a_{k+1}, C_k) \text{ for all } x \in C_k,\ \|x\| = 1. \]

So, we may apply Lemma 2.3.2 and Proposition 2.3.4 to show that $E_{k+1}$ implies

\begin{align}
\rho([a_1, \ldots, a_{k+1}], C)
&\geq \min\left\{ \frac{\kappa(a_1, \ldots, a_k)}{2},\ \frac{\kappa(a_1, \ldots, a_k)\, \rho(a_{k+1}, C_k)}{4 + 2\|a_{k+1}\|},\ \frac{\rho(a_{k+1}, C_k)}{4 + 2\|a_{k+1}\|} \right\} \notag \\
&\geq \frac{\min\{ \kappa(a_1, \ldots, a_k),\ \kappa(a_1, \ldots, a_k)\, \rho(a_{k+1}, C_k),\ \rho(a_{k+1}, C_k) \}}{4 + 2\|a_{k+1}\|}. \tag{9}
\end{align}



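We now proceed to bound the probability that the numerator of this fraction is small. We first note that

\[ \kappa(a_1, \ldots, a_k)\, \rho(a_{k+1}, C_k) \leq \lambda \]

implies that either $\kappa(a_1, \ldots, a_k) \leq \lambda$, $\rho(a_{k+1}, C_k) \leq \lambda$, or there exists an $l$ between $1$ and $\lceil \log(1/\lambda) \rceil$ for which

\[ \kappa(a_1, \ldots, a_k) \leq 2^{-l+1} \quad\text{and}\quad \rho(a_{k+1}, C_k) \leq 2^{l} \lambda. \]

We apply Lemma 2.1.8 to bound

\[ \Pr_{a_{k+1}}[E_{k+1} \text{ and } \rho(a_{k+1}, C_k) \leq \lambda] \leq \frac{4 \lambda d^{1/4}}{\sigma}, \tag{10} \]

and Lemma 2.2.9 to bound

\[ \Pr_{a_1, \ldots, a_k}[C_k \text{ is feasible and } \kappa(a_1, \ldots, a_k) \leq \lambda] \leq \frac{4 \lambda n d^{5/4}}{\sigma}. \tag{11} \]

So, for $1 \leq l \leq \lceil \log(1/\lambda) \rceil$, we obtain

\begin{align*}
&\Pr\big[ E_{k+1} \text{ and } \kappa(a_1, \ldots, a_k) \leq 2^{-l+1} \text{ and } \rho(a_{k+1}, C_k) \leq 2^{l}\lambda \big] \\
&\quad = \Pr\big[ C_k \neq \emptyset \text{ and } \kappa(a_1, \ldots, a_k) \leq 2^{-l+1} \big] \cdot \Pr\big[ C_{k+1} = \emptyset \text{ and } \rho(a_{k+1}, C_k) \leq 2^{l}\lambda \ \big|\ C_k \neq \emptyset \text{ and } \kappa(a_1, \ldots, a_k) \leq 2^{-l+1} \big] \\
&\quad \leq \Pr\big[ C_k \neq \emptyset \text{ and } \kappa(a_1, \ldots, a_k) \leq 2^{-l+1} \big] \cdot \frac{2^{l}\, 4 \lambda d^{1/4}}{\sigma}, \text{ by (10),} \\
&\quad \leq \frac{2^{-l+1}\, 4 n d^{5/4}}{\sigma} \cdot \frac{2^{l}\, 4 \lambda d^{1/4}}{\sigma}, \text{ by (11),} \\
&\quad = \frac{32 \lambda n d^{1.5}}{\sigma^2}.
\end{align*}

Summing over the choices for $l$, we obtain

\begin{align}
&\Pr\big[ E_{k+1} \text{ and } \min\{ \kappa(a_1, \ldots, a_k),\ \kappa(a_1, \ldots, a_k)\, \rho(a_{k+1}, C_k),\ \rho(a_{k+1}, C_k) \} < \lambda \big] \notag \\
&\quad \leq \frac{4 \lambda n d^{5/4}}{\sigma} + \frac{4 \lambda d^{1/4}}{\sigma} + \lceil \log(1/\lambda) \rceil \frac{32 \lambda n d^{1.5}}{\sigma^2} \notag \\
&\quad \leq \lambda\, \frac{4 n d^{3/4} + 4 + 32 \lceil \log(1/\lambda) \rceil n d^{1.5}}{\sigma^2}, \text{ by } \sigma \leq 1/\sqrt{d}, \notag \\
&\quad \leq \lambda\, \frac{(32 \lceil \log(1/\lambda) \rceil + 8)\, n d^{1.5}}{\sigma^2}. \tag{12}
\end{align}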
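The passage from the three-term sum to bound (12) uses only $\sigma \leq 1/\sqrt{d}$, $n \geq 1$ and $d \geq 1$. A small numeric sketch of that algebra follows; the function names are illustrative, and the logarithm is taken to be natural (an assumption that does not affect the inequality, since the same $\lceil \log(1/\lambda) \rceil$ appears on both sides).

```python
import math


def three_term_sum(lam, n, d, sigma):
    """Left side: the two single-event bounds plus the sum over the levels l."""
    levels = math.ceil(math.log(1 / lam))
    return (4 * lam * n * d ** 1.25 / sigma
            + 4 * lam * d ** 0.25 / sigma
            + levels * 32 * lam * n * d ** 1.5 / sigma ** 2)


def bound_12(lam, n, d, sigma):
    """Right side of (12): lam * (32 * ceil(log(1/lam)) + 8) * n * d^1.5 / sigma^2."""
    levels = math.ceil(math.log(1 / lam))
    return lam * (32 * levels + 8) * n * d ** 1.5 / sigma ** 2


def consolidation_holds(lam, n, d, sigma, rel_tol=1e-12):
    """Requires 0 < lam < 1 and sigma <= 1/sqrt(d), as in the text."""
    return three_term_sum(lam, n, d, sigma) <= bound_12(lam, n, d, sigma) * (1 + rel_tol)
```

The check reflects the two absorptions in the text: $4\lambda n d^{5/4}/\sigma \leq 4\lambda n d^{3/4}/\sigma^2$ and $4\lambda d^{1/4}/\sigma \leq 4\lambda/\sigma^2$, each of which is at most $4\lambda n d^{1.5}/\sigma^2$.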



This concludes our analysis of the numerator of (9). We can bound the probability that the denominator of (9) is large by observing that $a_{k+1}$ is a Gaussian centered at a point $\bar{a}_{k+1}$ of norm at most 1; so, Corollary A.0.3 implies

\[ \Pr\Big[ 4 + 2\|a_{k+1}\| \geq 6 + 2\sigma \sqrt{2 d \ln(e/\epsilon)} \Big] \leq \epsilon. \tag{13} \]

We now set $\lambda = \epsilon \big( 6 + 2\sigma \sqrt{2 d \ln(e/\epsilon)} \big)$ and observe that if we had

\[ \frac{\min\{ \kappa(a_1, \ldots, a_k),\ \kappa(a_1, \ldots, a_k)\, \rho(a_{k+1}, C_k),\ \rho(a_{k+1}, C_k) \}}{4 + 2\|a_{k+1}\|} \leq \epsilon, \]

this would imply

\[ \min\{ \kappa(a_1, \ldots, a_k),\ \kappa(a_1, \ldots, a_k)\, \rho(a_{k+1}, C_k),\ \rho(a_{k+1}, C_k) \} \leq \lambda, \quad\text{or}\quad 4 + 2\|a_{k+1}\| \geq 6 + 2\sigma \sqrt{2 d \ln(e/\epsilon)}. \]

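So, we may apply (12) and (13) to obtain

\begin{align*}
&\Pr[E_{k+1} \text{ and } \rho([a_1, \ldots, a_{k+1}], C) \leq \epsilon] \\
&\quad \leq \epsilon + \epsilon \big( 6 + 2\sigma \sqrt{2 d \ln(e/\epsilon)} \big) \frac{\Big( 32 \Big\lceil \log \frac{1}{\epsilon ( 6 + 2\sigma \sqrt{2 d \ln(e/\epsilon)} )} \Big\rceil + 8 \Big) n d^{1.5}}{\sigma^2} \\
&\quad \leq \epsilon + \epsilon \big( 6 + 3\sqrt{\ln(e/\epsilon)} \big) \frac{(32 \lceil \log(1/(6\epsilon)) \rceil + 8)\, n d^{1.5}}{\sigma^2}, \text{ by } \sigma \leq 1/\sqrt{d}, \\
&\quad \leq \epsilon + 11.25\, \epsilon \sqrt{\ln(e/\epsilon)}\ \frac{(32 \lceil \log(1/(6\epsilon)) \rceil + 8)\, n d^{1.5}}{\sigma^2} \\
&\quad \leq \epsilon + \epsilon\, \frac{360 \log^{1.5}(1/\epsilon)\, n d^{1.5}}{\sigma^2}, \\
&\qquad \text{since } \sqrt{\ln(e/\epsilon)} \big( \lceil \log(1/(6\epsilon)) \rceil + 1/4 \big) \leq \log^{1.5}(1/\epsilon) \text{ for } \epsilon < 1/361, \\
&\quad \leq \epsilon\, \frac{361 \log^{1.5}(1/\epsilon)\, n d^{1.5}}{\sigma^2}.
\end{align*}

Plugging this in to (8), we get

\[ \Pr[C_n \text{ is infeasible and } \rho([a_1, \ldots, a_n], C) \leq \epsilon] \leq \frac{361\, \epsilon\, n^2 d^{1.5} \log^{1.5}(1/\epsilon)}{\sigma^2}. \]

□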



2.4 Primal condition number, putting the feasible and infeasible cases together

We combine the results of Sections 2.2 and 2.3 to prove Lemma 2.1.4, which says that the primal condition number is probably low.



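Proof of Lemma 2.1.4. In Lemma 2.2.1, we show that

\[ \Pr[(Ax \geq 0,\ x \in C \text{ is feasible}) \text{ and } (\rho(A, C) \leq \epsilon)] \leq \frac{4 \epsilon n d^{5/4}}{\sigma}, \]

while in Lemma 2.3.1, we show

\[ \Pr[(Ax \geq 0,\ x \in C \text{ is infeasible}) \text{ and } (\rho(A, C) \leq \epsilon)] \leq \frac{361\, \epsilon \log^{1.5}(1/\epsilon)\, n^2 d^{1.5}}{\sigma^2}. \]

Thus,

\begin{align}
\Pr[\rho(A, C) \leq \epsilon] &= \Pr[(Ax \geq 0,\ x \in C \text{ is feasible}) \text{ and } (\rho(A, C) \leq \epsilon)] \notag \\
&\quad + \Pr[(Ax \geq 0,\ x \in C \text{ is infeasible}) \text{ and } (\rho(A, C) \leq \epsilon)] \notag \\
&\leq \frac{4 \epsilon n d^{5/4}}{\sigma} + \frac{361\, \epsilon \log^{1.5}(1/\epsilon)\, n^2 d^{1.5}}{\sigma^2} \notag \\
&\leq \frac{365\, \epsilon \log^{1.5}(1/\epsilon)\, n^2 d^{1.5}}{\sigma^2}. \tag{14}
\end{align}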



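Setting $\epsilon = \delta / (3\alpha \log^{1.5}(\alpha/\delta))$, where $\alpha = \frac{365\, n^2 d^{1.5}}{\sigma^2}$ (note that this satisfies $\epsilon \leq 1/2$), we obtain

\begin{align}
\Pr\left[ \frac{1}{\rho(A, C)} \geq \frac{1100\, n^2 d^{1.5}}{\delta \sigma^2} \log^{1.5}\left( \frac{365\, n^2 d^{1.5}}{\delta \sigma^2} \right) \right]
&\leq \frac{\alpha\, \delta \log^{1.5}\big( \frac{3\alpha}{\delta} \log^{1.5}(\frac{\alpha}{\delta}) \big)}{3 \alpha \log^{1.5}(\frac{\alpha}{\delta})} \notag \\
&\leq 0.74\, \delta, \tag{15}
\end{align}

as $\alpha/\delta \geq 365$.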
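The constant $0.74$ in (15) depends only on the ratio $r = \alpha/\delta \geq 365$. As a hedged sanity check, the sketch below evaluates the factor multiplying $\delta$, under the assumption that $\log$ denotes the natural logarithm (with base 2 the value at $r = 365$ would exceed $0.74$, which suggests the natural logarithm is intended).

```python
import math


def tail_factor(r):
    """log^{1.5}(3 r log^{1.5} r) / (3 log^{1.5} r), with natural logarithms.

    Here r plays the role of alpha / delta in the text.
    """
    inner = math.log(r) ** 1.5
    return math.log(3 * r * inner) ** 1.5 / (3 * inner)


def factor_below_constant(r_values, constant=0.74):
    """True when the factor stays below the constant on every supplied ratio."""
    return all(tail_factor(r) <= constant for r in r_values)
```

For example, `factor_below_constant(365 * 1.5 ** k for k in range(60))` probes ratios from $365$ up to roughly $10^{13}$; the factor decreases slowly toward $1/3$ as $r$ grows.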

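At the same time, Corollary A.0.3 tells us that

\[ \Pr\Big[ \|A\|_F \geq 1 + \sigma \sqrt{nd\, 2 \ln(4e/\delta)} \Big] \leq \delta/4. \]

The lemma now follows by applying this bound, $\sigma \leq 1/\sqrt{nd}$, and (15), to get

\[ \Pr\left[ \frac{\|A\|_F}{\rho(A, C)} \geq \Big( 1 + \sqrt{2 \ln(4e/\delta)} \Big) \frac{1100\, n^2 d^{1.5}}{\delta \sigma^2} \log^{1.5}\left( \frac{365\, n^2 d^{1.5}}{\delta \sigma^2} \right) \right] \leq (0.74 + 0.25)\, \delta \leq \delta. \]

To derive the lemma as stated, we note

\[ \Big( 1 + \sqrt{2 \ln(4e/\delta)} \Big) \frac{1100\, n^2 d^{1.5}}{\delta \sigma^2} \log^{1.5}\left( \frac{365\, n^2 d^{1.5}}{\delta \sigma^2} \right) \leq \frac{2^{12}\, n^2 d^{1.5}}{\delta \sigma^2} \log^{2}\left( \frac{2^{9}\, n^2 d^{1.5}}{\delta \sigma^2} \right). \]

□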



2.5 Log of the Primal Condition Number

In this section, we prove Lemma 2.1.5.

Proof of Lemma 2.1.5. First notice that

\[ \mathbf{E}\left[ \log \frac{\|A\|_F}{\rho(A, C)} \right] = \mathbf{E}\left[ \log \|A\|_F + \log \frac{1}{\rho(A, C)} \right]. \]

We first focus on $\mathbf{E}[\log \|A\|_F]$. Because the logarithm is a concave function, we have

\[ \mathbf{E}[\log \|A\|_F] \leq \log(\mathbf{E}[\|A\|_F]) \leq \log \sqrt{\mathbf{E}\big[ \|A\|_F^2 \big]}. \]

As $\|A\|_F^2$ is a $dn$-dimensional non-central $\chi^2$ random variable with non-centrality parameter $\|\bar{A}\|_F$, its expectation is $nd + \|\bar{A}\|_F$ [AS70, 26.4.37]. Therefore,

\[ \mathbf{E}[\log \|A\|_F] \leq \log \sqrt{nd + 1}. \]

We will use the following simple fact, which is easy to verify numerically:

Fact 2.5.1. For all $\alpha \geq 100$ and $x \geq 2 \log \alpha$, $x - 1.5 \log x \geq x/2$.
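Since the text says Fact 2.5.1 is easy to verify numerically, here is one way such a check might look. It is a sketch under the assumption that $\log$ is the natural logarithm; the second helper evaluates in closed form the tail integral $\alpha \int_{2 \log \alpha}^{\infty} e^{-x/2}\, dx$ that the fact is later used to bound.

```python
import math


def fact_2_5_1_holds(x):
    """x - 1.5 log x >= x / 2, with log taken as the natural logarithm (an assumption)."""
    return x - 1.5 * math.log(x) >= x / 2


def tail_integral(alpha):
    """Closed form of alpha * integral_{2 log alpha}^{infinity} e^{-x/2} dx."""
    lower = 2 * math.log(alpha)
    # The antiderivative of e^{-x/2} is -2 e^{-x/2}, so the integral is 2 e^{-lower/2}.
    return alpha * 2 * math.exp(-lower / 2)
```

The inequality is equivalent to $x \geq 3 \log x$, which already holds at $x = 2 \log 100 \approx 9.21$ and continues to hold because $x/2 - 1.5 \log x$ is increasing for $x > 3$. The tail integral evaluates to $2$ for every $\alpha$, matching the additive constant in the bound $2 \log \alpha + 2$.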



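Let

\[ \alpha = \frac{365\, n^2 d^{1.5}}{\sigma^2} \]

as before. By Equation (14) in the proof of Lemma 2.1.4,

\[ \Pr\left[ \frac{1}{\rho(A, C)} > x \right] \leq \frac{\alpha \log^{1.5} x}{x}. \]

Therefore,

\begin{align*}
\mathbf{E}\left[ \log \frac{1}{\rho(A, C)} \right]
&\leq \int_{0}^{\infty} \Pr\left[ \log \frac{1}{\rho(A, C)} > x \right] dx \\
&= \int_{0}^{\infty} \Pr\left[ \frac{1}{\rho(A, C)} > e^{x} \right] dx \\
&\leq \int_{0}^{\infty} \min\left( 1,\ \frac{\alpha x^{1.5}}{e^{x}} \right) dx \\
&\leq \int_{0}^{2 \log \alpha} dx + \int_{2 \log \alpha}^{\infty} \frac{\alpha x^{1.5}}{e^{x}}\, dx \\
&= 2 \log \alpha + \alpha \int_{2 \log \alpha}^{\infty} e^{-x + 1.5 \log x}\, dx \\
&\leq 2 \log \alpha + \alpha \int_{2 \log \alpha}^{\infty} e^{-x/2}\, dx \\
&\leq 2 \log \alpha + 2,
\end{align*}

where the second-to-last inequality follows from Fact 2.5.1.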



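Thus,

\begin{align*}
\mathbf{E}\left[ \log \frac{\|A\|_F}{\rho(A, C)} \right]
&= \mathbf{E}\left[ \log \|A\|_F + \log \frac{1}{\rho(A, C)} \right] \\
&\leq \log \sqrt{nd + 1} + 2 \log \alpha + 2 \\
&\leq 14 + 4.5 \log \frac{nd}{\sigma}.
\end{align*}

□

3 Dual Condition Number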

In this section, we consider linear programs of the form

\[ A^T y = c, \quad y \geq 0. \]

The dual program of form (1) and the primal program of form (3) are both of this type. The dual program of form (4) can be handled using a slightly different argument than the one we present. As in Section 2, we omit the details of the modifications necessary for form (4). We begin by defining distance to ill-posedness appropriately for the form of linear program considered in this section:

Definition 3.0.1 (Dual distance to ill-posedness). For a matrix $A$ and a vector $c$, we define $\rho(A, c)$ by

a. if $A^T y = c$, $y \geq 0$ is feasible, then $\rho(A, c) =$

\[ \sup\{ \epsilon : \|\Delta A\|_F + \|\Delta c\|_F \leq \epsilon \text{ implies } (A + \Delta A)^T y = c + \Delta c,\ y \geq 0 \text{ is feasible} \} \]

b. if $A^T y = c$, $y \geq 0$ is infeasible, then $\rho(A, c) =$

\[ \sup\{ \epsilon : \|\Delta A\|_F + \|\Delta c\|_F \leq \epsilon \text{ implies } (A + \Delta A)^T y = c + \Delta c,\ y \geq 0 \text{ is infeasible} \} \]



The main result of this section is: 



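Lemma 3.0.2 (Dual condition number is likely low). Let $\bar{A}$ be an $n$-by-$d$ matrix and $\bar{c}$ a vector in $\mathbb{R}^d$ such that $\|\bar{A}\|_F \leq 1$ and $\|\bar{c}\| \leq 1$. Then for any $\sigma \leq 1/\sqrt{nd}$,

\[ \Pr_{(A, c) \leftarrow \mathcal{N}((\bar{A}, \bar{c}), \sigma)}\left[ \frac{\|A, c\|_F}{\rho(A, c)} > \frac{50000\, d^{1/4} n^{1/2}}{\delta \sigma^2} \log^{2}\left( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \right) \right] \leq \delta. \]

In addition,

\[ \mathbf{E}_{(A, c) \leftarrow \mathcal{N}((\bar{A}, \bar{c}), \sigma)}\left[ \log \frac{\|A, c\|_F}{\rho(A, c)} \right] \leq 14 + 4 \log \frac{nd}{\sigma}. \]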



We begin by giving several common definitions that will be useful in our analysis of the dual condition number (Section 3.1). We define a change of variables (Section 3.2), and we then develop a sufficient geometric condition for the dual condition number to be low (Section 3.3). In Section 3.4, we use Lemma 3.2.3 to prove Lemma 3.0.2, thereby establishing that this geometric condition is met with good probability.



3.1 Geometric Basics 



Definition 3.1.1 (Cone). For a set of vectors $a_1, \ldots, a_n$, let $\mathrm{Cone}(a_1, \ldots, a_n)$ denote $\{x : x = \sum_i \lambda_i a_i,\ \lambda_i \geq 0\}$.

Definition 3.1.2 (Hull). For a set of vectors $a_1, \ldots, a_n$, let $\mathrm{Hull}(a_1, \ldots, a_n)$ denote $\{x : x = \sum_i \lambda_i a_i,\ \lambda_i \geq 0,\ \sum_i \lambda_i = 1\}$.

Definition 3.1.3 (Boundary of a set). For a convex set $S$, let $\mathrm{bdry}(S)$ denote the boundary of $S$, i.e., $\{x : \forall \epsilon > 0,\ \exists e,\ \|e\| \leq \epsilon, \text{ s.t. } x + e \in S,\ x - e \notin S\}$.

Definition 3.1.4 (Point-to-set distance). Let $\mathrm{dist}(x, S)$ denote the distance of $x$ to $S$, i.e.,

\[ \min\{\epsilon : \exists e,\ \|e\| \leq \epsilon, \text{ s.t. } x + e \in S\}. \]

Note that $\mathrm{Cone}(a_1, \ldots, a_n)$ is not a non-pointed convex cone, while $\mathrm{Hull}(a_1, \ldots, a_n)$ is the standard convex hull of $\{a_1, \ldots, a_n\}$.

3.2 Change of variables 

We observe that there exists a solution to the system $A^T y = c$, $y \geq 0$ if and only if

\[ c \in \mathrm{Cone}(a_1, \ldots, a_n), \]

and that for $c \neq 0$, this holds if and only if

\[ \mathrm{Ray}(c) \text{ intersects } \mathrm{Hull}(a_1, \ldots, a_n). \]

In this section, we need one technique beyond those used in Section 2: a change of variables. We set

\[ z = (1/n) \sum_{i=1}^{n} a_i, \quad\text{and}\quad x_i = a_i - z, \text{ for } i = 1 \text{ to } n - 1. \]

For notational convenience, we let $x_n = a_n - z$, although $x_n$ is not independent of $\{z, x_1, \ldots, x_{n-1}\}$. We can restate the condition for the linear program to be ill-posed in these new variables:

Lemma 3.2.1 (Ill-posedness in new variables). $A^T y = c$, $y \geq 0$, $c \neq 0$ is ill-posed if and only if $z \in \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))$.

Proof. We observe

\begin{align*}
A^T y = c,\ y \geq 0 \text{ is feasible} &\iff \mathrm{Ray}(c) \text{ intersects } \mathrm{Hull}(a_1, \ldots, a_n) \\
&\iff \mathrm{Ray}(c) \text{ intersects } z + \mathrm{Hull}(x_1, \ldots, x_n) \\
&\iff z \in \mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n).
\end{align*}

For $c \neq 0$, $\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$ is a continuous mapping from $c, x_1, \ldots, x_n$ to subsets of Euclidean space, and so for $z$ in the set and not on the boundary, a sufficiently small change to all the variables simultaneously will always leave $z$ in the set, and similarly for $z$ not in the set and not on the boundary.

To establish the other direction, we observe that if $z$ is on the boundary, then we can perturb $z$ to bring it in or out of the set. Although $z$ is determined by $a_1, \ldots, a_n$, we can perturb $a_1, \ldots, a_n$ so as to change the value of $z$ without changing the values of any of $x_1, \ldots, x_n$. This can be done because each $x_i$ is a relative offset from the average $z$, while each $a_i$ is an absolute offset from the origin; the proof of Lemma 3.2.2 below establishes formally that the change of variables permits this.

The lemma is also true for $c = 0$, but we will not need this fact. □

Note that $\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$ is a convex set. The following lemma will allow us to apply Lemma 2.1.7 to determine the probability that $z$ is near the boundary of this convex set.

Lemma 3.2.2 (Independence of mean among new variables). Let $\bar{a}_1, \ldots, \bar{a}_n$ be $n$ vectors in $\mathbb{R}^d$. Let $a_1, \ldots, a_n$ be a Gaussian perturbation of $\bar{a}_1, \ldots, \bar{a}_n$ of variance $\sigma^2$. Let

\[ z = \frac{1}{n} \sum_i a_i \quad\text{and}\quad x_i = a_i - z, \text{ for } 1 \leq i \leq n. \]

Then, $z$ is a Gaussian perturbation of

\[ \bar{z} = \frac{1}{n} \sum_i \bar{a}_i \]

of variance $\sigma^2/n$ and is independent of $x_1, \ldots, x_n$.

Proof. As $z$ is the average of Gaussian perturbations of variance $\sigma^2$ of $n$ vectors $\bar{a}_1, \ldots, \bar{a}_n$, it is a Gaussian perturbation of variance $\sigma^2/n$ of the average of these $n$ vectors, that is, of

\[ \frac{1}{n} \sum_i \bar{a}_i. \]

The vector $z$ is independent of $x_1, \ldots, x_n$ because the linear combination of $a_1, \ldots, a_n$ used to obtain $z$ is orthogonal to the linear combinations of $a_1, \ldots, a_n$ used to obtain the $x_i$'s. □
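The orthogonality claim in the proof can be verified exactly with rational arithmetic. The sketch below writes $z$ and each $x_i$ as coefficient vectors over $a_1, \ldots, a_n$ (the helper names are illustrative, not from the paper) and checks that the coefficient vector of $z$ is orthogonal to that of every $x_i$. For jointly Gaussian variables with equal, independent coordinates, zero inner product of the coefficient vectors corresponds to zero covariance and hence independence.

```python
from fractions import Fraction


def mean_weights(n):
    """Coefficients of a_1..a_n in z = (1/n) * sum_i a_i."""
    return [Fraction(1, n)] * n


def offset_weights(n, i):
    """Coefficients of a_1..a_n in x_i = a_i - z (i is 0-based here)."""
    return [(1 if j == i else 0) - Fraction(1, n) for j in range(n)]


def inner(u, v):
    """Exact inner product of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))
```

The same coefficient vectors also show why $x_n$ carries no new information: the offsets sum to the zero vector, so $x_n = -(x_1 + \cdots + x_{n-1})$.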

Lemma 3.2.3 (Mean is likely far from ill-posedness). Let $\bar{a}_1, \ldots, \bar{a}_n$ be $n$ vectors in $\mathbb{R}^d$ and $\bar{c}$ be a vector in $\mathbb{R}^d$. Let $a_1, \ldots, a_n$ be a Gaussian perturbation of $\bar{a}_1, \ldots, \bar{a}_n$ of variance $\sigma^2$ and let $c$ be a Gaussian perturbation of $\bar{c}$ of variance $\sigma^2$. Let

\[ z = \frac{1}{n} \sum_i a_i \quad\text{and}\quad x_i = a_i - z, \text{ for } 1 \leq i \leq n. \]

Then, for all $c$ and $x_1, \ldots, x_n$,

\[ \Pr_z\big[ \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) \leq \epsilon \big] \leq \frac{8 \epsilon d^{1/4} n^{1/2}}{\sigma}. \]

Proof. Let $c$ be arbitrary. By Lemma 3.2.2, we can choose $x_1, \ldots, x_n$ and then choose $z$ independently. Having chosen $x_1, \ldots, x_n$, we fix the convex body $\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$ and apply Lemma 2.1.7 twice: once for the inside $\epsilon$-boundary, and once for the outside $\epsilon$-boundary. □

3.3 A geometric characterization of dual condition number



We now give a geometric characterization of the dual condition number that uses both the original and the new variables. In the next section, we will use this characterization to prove Lemma 3.0.2.

Lemma 3.3.1 (Reciprocal of distance to ill-posedness). Let $c$ and $a_1, \ldots, a_n$ be vectors in $\mathbb{R}^d$. Let

\begin{align*}
z &= \frac{1}{n} \sum_i a_i \quad\text{and}\quad x_i = a_i - z, \text{ for } 1 \leq i \leq n, \\
k_1 &= \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))), \\
k_2 &= \|c\|.
\end{align*}

Then

\[ \frac{1}{\rho(A, c)} \leq \max\left\{ \frac{8}{k_1},\ \frac{4}{k_2},\ \frac{24 \max_i \|a_i\|}{k_1 k_2} \right\}. \]

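Proof. By the definition of $k_1$ and $k_2$ and Lemma 3.3.2, we can tolerate any change of magnitude up to $k_1/4$ in $z$ and $x_1, \ldots, x_n$, and any change of up to $\frac{k_1 k_2}{2 k_1 + 4(\|z\| + \max_i \|x_i\|)}$ in $c$ without the program becoming ill-posed. We show that this means we can tolerate any change of up to $k_1/8$ in $a_i$ without the program becoming ill-posed. Formally, we need to show that if $\|\Delta a_i\| \leq k_1/8$ for all $i$, then $\|\Delta z\| \leq k_1/4$ and $\|\Delta x_i\| \leq k_1/4$. Since $\Delta z = (1/n) \sum_i \Delta a_i$, $\|\Delta z\| \leq k_1/8$. Since $\Delta x_i = \Delta a_i - \Delta z$, $\|\Delta x_i\| \leq k_1/8 + k_1/8 = k_1/4$. Thus

\[ \rho(A, c) \geq \min\left\{ \frac{k_1}{8},\ \frac{k_1 k_2}{2 k_1 + 4(\|z\| + \max_i \|x_i\|)} \right\}, \]

which implies

\[ \frac{1}{\rho(A, c)} \leq \max\left\{ \frac{8}{k_1},\ \frac{4}{k_2},\ \frac{8(\|z\| + \max_i \|x_i\|)}{k_1 k_2} \right\}, \]

as

\[ \frac{2 k_1 + 4(\|z\| + \max_i \|x_i\|)}{k_1 k_2} \leq \begin{cases} \dfrac{4}{k_2} & \text{if } 2 k_1 \geq 4(\|z\| + \max_i \|x_i\|), \text{ and} \\[2ex] \dfrac{8(\|z\| + \max_i \|x_i\|)}{k_1 k_2} & \text{otherwise.} \end{cases} \]

Since $z = (1/n) \sum_i a_i$ implies $\|z\| \leq \max_i \|a_i\|$, and $x_i = a_i - z$ implies $\|x_i\| \leq \|a_i\| + \|z\| \leq 2 \max_i \|a_i\|$, we have

\[ \frac{1}{\rho(A, c)} \leq \max\left\{ \frac{8}{k_1},\ \frac{4}{k_2},\ \frac{24 \max_i \|a_i\|}{k_1 k_2} \right\}. \]

□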
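The case split used to pass from the reciprocal of the second term to $\max\{4/k_2,\ 8(\|z\| + \max_i \|x_i\|)/(k_1 k_2)\}$ can be spot-checked numerically. In this sketch `m` stands for $\|z\| + \max_i \|x_i\|$; the names are illustrative only.

```python
def reciprocal_term(k1, k2, m):
    """(2*k1 + 4*m) / (k1*k2), the reciprocal of the second term in the min."""
    return (2 * k1 + 4 * m) / (k1 * k2)


def case_bound(k1, k2, m):
    """max{ 4/k2, 8*m/(k1*k2) }, covering both cases of the split."""
    return max(4 / k2, 8 * m / (k1 * k2))


def split_holds(k1, k2, m, rel_tol=1e-12):
    """True when the reciprocal term is dominated by the two-case bound."""
    return reciprocal_term(k1, k2, m) <= case_bound(k1, k2, m) * (1 + rel_tol)
```

When $2k_1 \geq 4m$ the numerator is at most $4k_1$, giving $4/k_2$; otherwise it is at most $8m$, giving the second bound.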



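Lemma 3.3.2 (Geometric condition to be far from ill-posedness in new variables). If

\[ \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) > \alpha \tag{16} \]

and

\begin{align*}
\|\Delta x_i\| &\leq \alpha/4, \\
\|\Delta z\| &\leq \alpha/4, \\
\|\Delta c\| &\leq \frac{\alpha \|c\|}{2\alpha + 4(\|z\| + \max_i \|x_i\|)},
\end{align*}

then

\[ z + \Delta z \notin \mathrm{bdry}(\mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n)). \]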

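Proof. Assume by way of contradiction that

\[ z + \Delta z \in \mathrm{bdry}(\mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n)). \]

We first consider the case that $z \notin \mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$. In this case, we will show that $\mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) \leq \alpha$, contradicting assumption (16). Since $z + \Delta z \in \mathrm{bdry}(\mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n))$,

\[ z + \Delta z = \lambda (c + \Delta c) - \sum_i \gamma_i (x_i + \Delta x_i), \]

for some $\lambda \geq 0$ and $\gamma_1, \ldots, \gamma_n \geq 0$, $\sum_i \gamma_i = 1$. We establish an upper bound on $\lambda$ by noting that

\[ \lambda = \frac{\| z + \Delta z + \sum_i \gamma_i (x_i + \Delta x_i) \|}{\|c + \Delta c\|}. \tag{17} \]

We lower bound the denominator of (17) by $\|c\|/2$ by observing that

\[ \|\Delta c\| \leq \frac{\alpha \|c\|}{2\alpha + 4(\|z\| + \max_i \|x_i\|)} \leq \|c\|/2. \]

We upper bound the numerator of (17) by

\begin{align*}
\Big\| z + \Delta z + \sum_i \gamma_i (x_i + \Delta x_i) \Big\|
&\leq \|z\| + \alpha/4 + \sum_i \gamma_i (\|x_i\| + \|\Delta x_i\|) \\
&\leq \|z\| + \alpha/4 + \max_i \|x_i\| + \alpha/4 \\
&= \|z\| + \max_i \|x_i\| + \alpha/2.
\end{align*}

Thus,

\[ \lambda \leq \frac{\|z\| + \max_i \|x_i\| + \alpha/2}{\|c\|/2}. \]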

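Since

\[ z + \Delta z - \lambda \Delta c + \sum_i \gamma_i \Delta x_i = \lambda c - \sum_i \gamma_i x_i \in \mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n), \]

we find that

\begin{align*}
\mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)))
&\leq \Big\| \Delta z - \lambda \Delta c + \sum_i \gamma_i \Delta x_i \Big\| \\
&\leq \|\Delta z\| + \lambda \|\Delta c\| + \sum_i \gamma_i \|\Delta x_i\| \\
&\leq \frac{\alpha}{4} + \left( \frac{\|z\| + \max_i \|x_i\| + \alpha/2}{\|c\|/2} \right) \left( \frac{\alpha \|c\|}{2\alpha + 4(\|z\| + \max_i \|x_i\|)} \right) + \frac{\alpha}{4} \\
&= \alpha,
\end{align*}

contradicting assumption (16).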

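We now consider the case that $z \in \mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$. Since

\[ z + \Delta z \in \mathrm{bdry}(\mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n)), \]

there exists a hyperplane $H$ passing through $z + \Delta z$ and tangent to the convex set $\mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n)$. By the assumption that $\mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) > \alpha$, there is some $\delta_0 > 0$ such that, for every $\delta \in (0, \delta_0)$, every point within $\alpha + \delta$ of $z$ lies within $\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$. Choose $\delta \in (0, \delta_0)$ that also satisfies $\delta \leq \|z\| + \max_i \|x_i\|$. Let $q$ be a point at distance $\frac{3\alpha}{4} + \delta$ from $z + \Delta z$ in the direction perpendicular to $H$. Since $\mathrm{dist}(z, z + \Delta z) \leq \frac{\alpha}{4}$ and $\mathrm{dist}(z + \Delta z, q) = \frac{3\alpha}{4} + \delta$,

\[ q \in \mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n). \]

At the same time,

\[ \mathrm{dist}(q, \mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n)) \geq \frac{3\alpha}{4} + \delta. \tag{18} \]

Because $q \in \mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n)$, there exist $\lambda \geq 0$ and $\gamma_1, \ldots, \gamma_n \geq 0$, $\sum_i \gamma_i = 1$, such that

\[ q = \lambda c - \sum_i \gamma_i x_i. \]

We upper bound $\lambda$ as before,

\[ \lambda = \frac{\| q + \sum_i \gamma_i x_i \|}{\|c\|} \leq \frac{\|z\| + \alpha + \delta + \max_i \|x_i\|}{\|c\|} \leq \frac{\|z\| + \max_i \|x_i\| + \alpha/2}{\|c\|/2}. \]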

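Hence,

\begin{align*}
q + \lambda \Delta c - \sum_i \gamma_i \Delta x_i &= \lambda (c + \Delta c) - \sum_i \gamma_i (x_i + \Delta x_i) \\
&\in \mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n),
\end{align*}

and thus

\begin{align*}
\mathrm{dist}(q, \mathrm{Ray}(c + \Delta c) - \mathrm{Hull}(x_1 + \Delta x_1, \ldots, x_n + \Delta x_n))
&\leq \Big\| \lambda \Delta c - \sum_i \gamma_i \Delta x_i \Big\| \\
&\leq \lambda \|\Delta c\| + \max_i \|\Delta x_i\| \\
&\leq \alpha/2 + \alpha/4 \\
&\leq 3\alpha/4,
\end{align*}

which contradicts (18). □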
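Both cases of the proof rely on the same cancellation: the bound on $\lambda$ multiplied by the bound on $\|\Delta c\|$ equals exactly $\alpha/2$. The sketch below recomputes the two totals ($\alpha$ in the first case, $3\alpha/4$ in the second) from the stated bounds; the argument names are illustrative stand-ins for $\|z\|$, $\max_i \|x_i\|$ and $\|c\|$.

```python
def lambda_times_delta_c(alpha, z_norm, x_max, c_norm):
    """Product of the upper bound on lambda and the upper bound on ||Delta c||."""
    lam_bound = (z_norm + x_max + alpha / 2) / (c_norm / 2)
    dc_bound = alpha * c_norm / (2 * alpha + 4 * (z_norm + x_max))
    return lam_bound * dc_bound


def case_one_total(alpha, z_norm, x_max, c_norm):
    """alpha/4 + lambda*||Delta c|| + alpha/4, the first-case distance bound."""
    return alpha / 4 + lambda_times_delta_c(alpha, z_norm, x_max, c_norm) + alpha / 4


def case_two_total(alpha, z_norm, x_max, c_norm):
    """lambda*||Delta c|| + alpha/4, the second-case distance bound."""
    return lambda_times_delta_c(alpha, z_norm, x_max, c_norm) + alpha / 4
```

Algebraically the product is $\alpha(2M + \alpha)/(2\alpha + 4M) = \alpha/2$ with $M = \|z\| + \max_i \|x_i\|$, independent of $\|c\|$.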



3.4 Dual condition number is likely low 

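Proof of Lemma 3.0.2. Let

\begin{align*}
z &= \frac{1}{n} \sum_i a_i \quad\text{and}\quad x_i = a_i - z, \text{ for } 1 \leq i \leq n, \\
k_1 &= \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) \quad\text{and}\quad k_2 = \|c\|.
\end{align*}

We will apply the bound of Lemma 3.3.1. We first lower bound $\min\{k_1, k_2, k_1 k_2\}$. We begin by observing that if

\[ \min\{k_1, k_2, k_1 k_2\} < \epsilon, \]

then either

\[ \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) < \epsilon, \tag{19} \]

or

\[ \|c\| < \epsilon, \tag{20} \]

or there exists some integer $l$, $1 \leq l \leq \lceil \log \frac{1}{\epsilon} \rceil$, for which

\[ \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) \leq 2^{l} \epsilon \quad\text{and}\quad \|c\| \leq 2^{-l+1}. \tag{21} \]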

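The probabilities of events (19) and (20) will also be bounded in our analysis of event (21). By Proposition A.0.4, for $d \geq 2$, we have

\[ \Pr_c[\|c\| \leq \epsilon] \leq \frac{e \epsilon}{\sigma}, \]

which translates to

\[ \Pr_c\big[ \|c\| \leq 2^{-l+1} \big] \leq \frac{e\, 2^{-l+1}}{\sigma}, \]

while Lemma 3.2.3 implies

\[ \Pr_z\big[ \mathrm{dist}(z, \mathrm{bdry}(\mathrm{Ray}(c) - \mathrm{Hull}(x_1, \ldots, x_n))) \leq 2^{l} \epsilon \big] \leq \frac{8 \cdot 2^{l} \epsilon d^{1/4} n^{1/2}}{\sigma}. \]

Thus, we compute

\begin{align*}
\Pr[\min\{k_1, k_2, k_1 k_2\} < \epsilon]
&\leq \frac{8 \epsilon d^{1/4} n^{1/2}}{\sigma} + \frac{e \epsilon}{\sigma} + \sum_{l=1}^{\lceil \log \frac{1}{\epsilon} \rceil} \frac{e\, 2^{-l+1}}{\sigma} \cdot \frac{8 \cdot 2^{l} \epsilon d^{1/4} n^{1/2}}{\sigma} \\
&= \frac{8 \epsilon d^{1/4} n^{1/2}}{\sigma} + \frac{e \epsilon}{\sigma} + \frac{16 e \epsilon d^{1/4} n^{1/2}}{\sigma^2} \Big\lceil \log \frac{1}{\epsilon} \Big\rceil \\
&\leq \frac{55\, \epsilon d^{1/4} n^{1/2}}{\sigma^2} \log \frac{1}{\epsilon}.
\end{align*}

Setting

\[ \epsilon = \frac{\delta \sigma^2}{200\, d^{1/4} n^{1/2} \log\big( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \big)}, \]

we find that

\[ \frac{55\, \epsilon d^{1/4} n^{1/2}}{\sigma^2} \log \frac{1}{\epsilon} \leq \delta/2, \]

for $\delta \leq 1$. So, we obtain

\[ \Pr\left[ \min\{k_1, k_2, k_1 k_2\} < \frac{\delta \sigma^2}{200\, d^{1/4} n^{1/2} \log\big( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \big)} \right] \leq \delta/2, \]

which we rewrite as

\[ \Pr\left[ \max\left\{ \frac{1}{k_1},\ \frac{1}{k_2},\ \frac{1}{k_1 k_2} \right\} > \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \log\left( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \right) \right] \leq \frac{\delta}{2}. \tag{22} \]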



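By Lemma 3.3.1, we have

\begin{align*}
\frac{1}{\rho(A, c)} &\leq \max\left\{ \frac{8}{k_1},\ \frac{4}{k_2},\ \frac{24 \max_i \|a_i\|}{k_1 k_2} \right\} \\
&\leq 24 \max\big( \max_i \|a_i\|,\ 1 \big) \max\left\{ \frac{1}{k_1},\ \frac{1}{k_2},\ \frac{1}{k_1 k_2} \right\}.
\end{align*}

By Corollary A.0.3, we have

\[ \Pr\Big[ \max(\|A, c\|_F, 1) \geq 3 + \sigma \sqrt{(d+1) n\, 2 \ln(2e/\delta)} \Big] \leq \frac{\delta}{2}, \tag{23} \]

and $\max(\|A, c\|_F, 1) \geq \max(\max_i \|a_i\|, 1)$.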
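The choice of $\epsilon$ that leads to (22) can be checked numerically. The following sketch assumes natural logarithms and uses illustrative function names; it evaluates $\frac{55 \epsilon d^{1/4} n^{1/2}}{\sigma^2} \log \frac{1}{\epsilon}$ at the stated $\epsilon$ and compares it with $\delta/2$.

```python
import math


def chosen_epsilon(n, d, sigma, delta):
    """epsilon = delta*sigma^2 / (200 d^{1/4} n^{1/2} log(200 d^{1/4} n^{1/2} / (delta sigma^2)))."""
    x = 200 * d ** 0.25 * n ** 0.5 / (delta * sigma ** 2)
    return 1 / (x * math.log(x))


def failure_bound(n, d, sigma, delta):
    """55 * epsilon * d^{1/4} n^{1/2} / sigma^2 * log(1/epsilon) at the chosen epsilon."""
    eps = chosen_epsilon(n, d, sigma, delta)
    return 55 * eps * d ** 0.25 * n ** 0.5 / sigma ** 2 * math.log(1 / eps)


def choice_is_valid(n, d, sigma, delta):
    """True when the bound is at most delta / 2, as claimed in the text."""
    return failure_bound(n, d, sigma, delta) <= delta / 2
```

Writing $X = 200\, d^{1/4} n^{1/2}/(\delta \sigma^2)$, the bound equals $\frac{55}{200}\, \delta\, \frac{\log(X \log X)}{\log X}$, and the ratio of logarithms stays well below $200/110$ for $X \geq 200$.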

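From a union bound of inequalities (22) and (23), we obtain

\[ \Pr\left[ \frac{\|A, c\|_F}{\rho(A, c)} > \frac{24 \cdot 200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \log\left( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \right) \Big( 3 + \sigma \sqrt{(d+1) n\, 2 \ln(2e/\delta)} \Big)^2 \right] \leq \delta. \]

The proof of the first part of the lemma now follows by computing

\[ \frac{24 \cdot 200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \log\left( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \right) \Big( 3 + \sigma \sqrt{(d+1) n\, 2 \ln(2e/\delta)} \Big)^2 \leq \frac{50000\, d^{1/4} n^{1/2}}{\delta \sigma^2} \log^{2}\left( \frac{200\, d^{1/4} n^{1/2}}{\delta \sigma^2} \right), \]

where we used the assumption $\sigma \leq 1/\sqrt{dn}$.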

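We now establish the smoothed bound on the expectation of the logarithm. Note that

\begin{align*}
\mathbf{E}\left[ \log \frac{\|A, c\|_F}{\rho(A, c)} \right]
&= \mathbf{E}[\log \|A, c\|_F] + \mathbf{E}\left[ \log \frac{1}{\rho(A, c)} \right] \\
&\leq \mathbf{E}[\log \|A, c\|_F] + \mathbf{E}\left[ \log \max\left\{ \frac{1}{k_1},\ \frac{1}{k_2},\ \frac{1}{k_1 k_2} \right\} \right] + \mathbf{E}\big[ \log\big( 24 \max(\max_i \|a_i\|, 1) \big) \big] \\
&\leq \mathbf{E}[\log \max(\|A, c\|_F, 1)] + \mathbf{E}\left[ \log \max\left\{ \frac{1}{k_1},\ \frac{1}{k_2},\ \frac{1}{k_1 k_2} \right\} \right] + \mathbf{E}\big[ \log\big( 24 \max(\|A\|_F, 1) \big) \big] \\
&\leq \log \sqrt{n(d+1) + 1} + 2 \log \frac{55\, d^{1/4} n^{1/2}}{\sigma^2} + 2 + \log 24 + \log \sqrt{nd + 1} \\
&\leq 14 + 4 \log \frac{nd}{\sigma},
\end{align*}

where the bound on the second term is derived using the same argument as in the proof of Lemma 2.1.5. □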



4 Combining the Primal and Dual Analyses 



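Proof of Theorem 1.3.1. Note that the transformation of each canonical form into the conic form leaves the Frobenius norm unchanged. Also, a random Gaussian perturbation in the original form maps to a random Gaussian perturbation in the conic form. Therefore, by Lemma 2.1.2, the smoothed bounds on the primal and dual condition numbers of the conic forms imply smoothed bounds on each of the condition numbers $C^{(1)}, C^{(2)}, C^{(3)}, C^{(4)}$.

By Lemma 2.1.4 and Lemma 3.0.2, we have that for all $\bar{A}$, $\bar{b}$ and $\bar{c}$ satisfying $\|\bar{A}, \bar{b}, \bar{c}\|_F \leq 1$ and $\sigma \leq 1/\sqrt{nd}$,

\begin{align*}
&\Pr_{A, b, c}\left[ C^{(i)}(A, b, c) > \frac{2^{12}\, n^2 (d+1)^{1.5}}{(\delta/2) \sigma^2} \log^{2}\left( \frac{2^{9}\, n^2 (d+1)^{1.5}}{(\delta/2) \sigma^2} \right) \right] \\
&\quad \leq \Pr_{A, b, c}\left[ C_P^{(i)}(A, b) > \frac{2^{12}\, n^2 (d+1)^{1.5}}{(\delta/2) \sigma^2} \log^{2}\left( \frac{2^{9}\, n^2 (d+1)^{1.5}}{(\delta/2) \sigma^2} \right) \right] \\
&\qquad + \Pr_{A, b, c}\left[ C_D^{(i)}(A, c) > \frac{2^{12}\, n^2 (d+1)^{1.5}}{(\delta/2) \sigma^2} \log^{2}\left( \frac{2^{9}\, n^2 (d+1)^{1.5}}{(\delta/2) \sigma^2} \right) \right] \\
&\quad \leq \delta/2 + \delta/2 = \delta.
\end{align*}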



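To bound the log of the condition number, we use Lemma 2.1.5 and Lemma 3.0.2 to show

\begin{align*}
\mathbf{E}_{A, b, c}\big[ \log C^{(i)}(A, b, c) \big]
&\leq \mathbf{E}_{A, b, c}\big[ \log\big( C_P^{(i)}(A, b) + C_D^{(i)}(A, c) \big) \big] \\
&\leq \max\Big( \mathbf{E}_{A, b, c}\big[ \log\big( 2 C_P^{(i)}(A, b) \big) \big],\ \mathbf{E}_{A, b, c}\big[ \log\big( 2 C_D^{(i)}(A, c) \big) \big] \Big) \\
&\leq 15 + 4.5 \log\left( \frac{nd}{\sigma} \right),
\end{align*}

where in the second-to-last inequality we used the fact that for positive random variables $\beta$ and $\gamma$,

\[ \mathbf{E}[\log(\beta + \gamma)] \leq \max\big( \mathbf{E}[\log(2\beta)],\ \mathbf{E}[\log(2\gamma)] \big). \]

□

5 Open Problems and Conclusion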



The best way to strengthen the results in this paper would be to prove that they hold under more restrictive models of perturbation. For example, we ask whether similar results can be proved if one perturbs the linear program subject to maintaining feasibility or infeasibility. This would be an example of a property-preserving perturbation, as defined in [ST03a].

A related question is whether these results can be proved under zero-preserving perturbations in which only non-zero entries of $A$ are subject to perturbations. Unfortunately, the following example shows that in this model of zero-preserving perturbations, it is not possible to bound the condition number by $\mathrm{poly}(n, d, \frac{1}{\sigma})$ with probability at least $1/2$. Therefore, if such a result were to hold in the model of zero-preserving perturbations, it would not be because of a polynomial bound on the condition number.

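Let $A$ be a zero-preserving Gaussian perturbation of $\bar{A}$ with variance $\sigma^2$. For ease of exposition, we will normalize $\|\bar{A}\|_F$ to be 1 at the end of the formulation. Define the matrix

\[ \bar{A} = \begin{pmatrix} -1 & \epsilon & & & \\ & -1 & \epsilon & & \\ & & \ddots & \ddots & \\ & & & -1 & \epsilon \end{pmatrix}, \]

where $\epsilon$ is a parameter to be chosen later, and consider the linear program $Ax \geq 0$, $x \in C$, where $C = \{x : x \geq 0\}$. The $i$-th constraint of $\bar{A} x \geq 0$ is exactly

\[ \epsilon x_{i+1} \geq x_i. \]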



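We apply Fact A.0.2 with $c = \delta^2/\sigma^2$ assumed to be at least 6 (so that $(1 - c + \ln c) \leq -c/2$). This yields

\begin{align}
\Pr\big[ |a_{i,i} - (-1)| \geq \delta \big] &\leq e^{\frac{1}{2}\left( 1 - \frac{\delta^2}{\sigma^2} + \ln \frac{\delta^2}{\sigma^2} \right)}, \tag{24} \\
\Pr\big[ |a_{i,i+1} - \epsilon| \geq \delta \big] &\leq e^{\frac{1}{2}\left( 1 - \frac{\delta^2}{\sigma^2} + \ln \frac{\delta^2}{\sigma^2} \right)}. \tag{25}
\end{align}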
(25) 



Setting 5 = log re yields that, with probability at least 1/2, none of the events (|24j). (|25|) 
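happen for any $i$. Assuming that none of the events (24), (25) occur, and that $\epsilon > \delta$ (which we will ensure later), we have that $Ax \geq 0$, $x \in C$ is feasible, and

\[ x = \left( 1,\ \frac{1+\delta}{\epsilon-\delta},\ \ldots,\ \left( \frac{1+\delta}{\epsilon-\delta} \right)^{n-1} \right) \]

is one such feasible solution. We also have that $(\epsilon + \delta) x_{i+1} \geq (1 - \delta) x_i$ for every $i$. Define

\[ \Delta A = \begin{pmatrix} 0 & \cdots & 0 & -\left( \frac{\epsilon+\delta}{1-\delta} \right)^{n-2} \\ 0 & \cdots & 0 & 0 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & 0 \end{pmatrix}. \]

We now show that $(A + \Delta A) x \geq 0$, $x \in C$ is infeasible, and hence $\rho(A, C) \leq \|\Delta A\|_F = \left( \frac{\epsilon+\delta}{1-\delta} \right)^{n-2}$. To see infeasibility, note that the constraint given by the top row of $(A + \Delta A)$ is

\[ a_{1,1} x_1 + a_{1,2} x_2 - \left( \frac{\epsilon+\delta}{1-\delta} \right)^{n-2} x_n \geq 0, \]

while we simultaneously have that $x_2 \leq \left( \frac{\epsilon+\delta}{1-\delta} \right)^{n-2} x_n$. Assuming $\epsilon < 1$ (which we ensure later), this constraint is impossible to satisfy for $x \in C$.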


Letting e = ± and a = £ (and hence <5 = a^S) yields p(A C) = (f±f f" 2 = (2£2) 



which is exponentially small and also satisfies the requirements on $\epsilon$. We can upper bound $\|A\|_F$ by $\|A\|_F \leq \sqrt{n(1+\delta)^2 + n(\epsilon+\delta)^2} \leq 2\sqrt{n}$. Thus the condition number, which is equal to $\|A\|_F/\rho(A, C)$, is at least $\Omega(n)^{n-3}$.



If we had normalized $\|A\|_F = 1$ at the beginning of the proof, $\epsilon$ and $\sigma$ would be rescaled correspondingly, which still shows the negative result. This analysis also
shows the impossibility of a theorem like Theorem 1.3.1 for another natural model of perturbation,
relative perturbation, that is also zero-preserving: multiplying each entry of $A$ by an $N(1, \sigma^2)$
Gaussian random variable. This concludes our discussion of impossibility results for smoothed
analysis.

We would like to point out that condition numbers appear throughout Numerical Analysis and
that condition numbers may be defined for many non-linear problems. The speed of algorithms
for optimizing linear functions over convex bodies (including semidefinite programming) has
been related to their condition numbers [Fre02, FV00], and it seems that one might be able to
extend our results to these algorithms as well. Condition numbers have also been defined for
non-linear programming problems, and one could attempt to perform a smoothed analysis of
non-linear optimization algorithms by relating their performance to the condition numbers of
their inputs, and then performing a smoothed analysis of their condition numbers.

The approach of proving smoothed complexity bounds by relating the performance of an algorithm
to some property of its input, such as a condition number, and then performing a smoothed
analysis of this quantity has also been recently used in [HTMal E5TQ2]. Finally, we hope that this
work illuminates some of the shared interests of the Numerical Analysis, Operations Research,
and Theoretical Computer Science communities.

A Gaussian random variables



We now derive particular versions of well-known bounds on the Chi-Squared distribution, which
are used in the body of the paper. We thank DasGupta and Gupta [DG99] for this particular
derivation.

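Fact A.0.1 (Sum of gaussians). Let $X_1, \ldots, X_d$ be independent $N(0, \sigma)$ random variables.
Then

\[
\Pr\left[\sum_{i=1}^{d} X_i^2 \ge \kappa^2\right] \le e^{\frac{d}{2}\left(1 - \frac{\kappa^2}{d\sigma^2} + \ln\frac{\kappa^2}{d\sigma^2}\right)}.
\]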

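Proof. For simplicity, we begin with $Y_i \sim N(0, 1)$. A simple integration shows that if $Y \sim N(0, 1)$
then $E[e^{tY^2}] = \frac{1}{\sqrt{1-2t}}$ (for $t < \frac{1}{2}$). We proceed with

\[
\begin{aligned}
\Pr\left[\sum_{i=1}^{d} Y_i^2 - k \ge 0\right]
  &= \Pr\left[e^{t\left(\sum_{i=1}^{d} Y_i^2 - k\right)} \ge 1\right] && \text{(for } t > 0\text{)} \\
  &\le E\left[e^{t\left(\sum_{i=1}^{d} Y_i^2 - k\right)}\right] && \text{(by Markov's Ineq.)} \\
  &= \left(\frac{1}{1-2t}\right)^{d/2} e^{-kt} \\
  &\le \left(\frac{k}{d}\right)^{d/2} e^{-\frac{k}{2} + \frac{d}{2}} && \text{(letting } t = \tfrac{1}{2} - \tfrac{d}{2k}\text{)}.
\end{aligned}
\]

Since

\[
\Pr\left[\sum_{i=1}^{d} Y_i^2 \ge k\right] = \Pr\left[\sum_{i=1}^{d} X_i^2 \ge \sigma^2 k\right],
\]

we set $k = \kappa^2/\sigma^2$ and obtain $e^{\frac{d}{2}\left(1 - \frac{k}{d} + \ln\frac{k}{d}\right)} = e^{\frac{d}{2}\left(1 - \frac{\kappa^2}{d\sigma^2} + \ln\frac{\kappa^2}{d\sigma^2}\right)}$, which was our desired bound. □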
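The Markov-inequality step above can be sanity-checked numerically. The following Python sketch, which uses only the standard library, compares the closed-form bound with the bound at other values of $t$ and with a Monte Carlo estimate of the tail for standard normal variables. The function names, the sample sizes, and the choice $d = 5$, $k = 15$ are illustrative assumptions, not part of the paper.

```python
import math
import random


def markov_bound(d: int, k: float, t: float) -> float:
    """(1 / (1 - 2t))^(d/2) * e^(-k t), valid for 0 < t < 1/2."""
    return (1.0 - 2.0 * t) ** (-d / 2.0) * math.exp(-k * t)


def chi2_tail_bound(d: int, k: float) -> float:
    """Closed form e^((d/2)(1 - k/d + ln(k/d))), intended for k >= d."""
    r = k / d
    return math.exp(0.5 * d * (1.0 - r + math.log(r)))


def empirical_tail(d: int, k: float, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of Pr[sum of d squared N(0,1) variables >= k]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))
        if total >= k:
            hits += 1
    return hits / trials


d, k = 5, 15.0                      # assumed illustrative values
t_star = 0.5 - d / (2.0 * k)        # the choice of t made in the proof
print(markov_bound(d, k, t_star), chi2_tail_bound(d, k), empirical_tail(d, k))
```

The choice $t = \frac{1}{2} - \frac{d}{2k}$ is the minimizer of the Markov bound over $t \in (0, \frac{1}{2})$, which is why nearby values of `t` give a weaker bound in this sketch.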



In particular, this implies: 

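Fact A.0.2 (Alternative sum of gaussians). Let $X_1, \ldots, X_d$ be independent $N(0, \sigma)$ random
variables. Then for $c \ge 1$,

\[
\Pr\left[\sum_{i=1}^{d} X_i^2 \ge c\, d\, \sigma^2\right] \le e^{\frac{d}{2}(1 - c + \ln c)}.
\]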



Corollary A.0.3. Let $x$ be a $d$-dimensional Gaussian random vector of variance $\sigma^2$ centered
at the origin. Then, for $d \ge 2$ and $\epsilon \le 1/e^2$,

\[
\Pr\left[\|x\| \ge \sigma\sqrt{d\left(1 + 2\ln(1/\epsilon)\right)}\right] \le \epsilon.
\]
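Proof. Set $c = 1 + 2\ln(1/\epsilon)$ in Fact A.0.2. Since $d \ge 2$ and $1 - c + \ln c \le 0$, we then compute

\[
e^{\frac{d}{2}(1 - c + \ln c)} \le e^{1 - c + \ln c} = e^{-2\ln\frac{1}{\epsilon} + \ln\left(1 + 2\ln\frac{1}{\epsilon}\right)} = \epsilon\, e^{-\ln\frac{1}{\epsilon} + \ln\left(1 + 2\ln\frac{1}{\epsilon}\right)}.
\]

We now seek to show

\[
\begin{aligned}
e^{-\ln\frac{1}{\epsilon} + \ln\left(1 + 2\ln\frac{1}{\epsilon}\right)} &\le 1 \\
\iff\quad -\ln\frac{1}{\epsilon} + \ln\left(1 + 2\ln\frac{1}{\epsilon}\right) &\le 0 \\
\iff\quad 1 + 2\ln\frac{1}{\epsilon} &\le \frac{1}{\epsilon}.
\end{aligned}
\]

For $\epsilon = 1/e^2$, the left-hand side of the last inequality is 5, while the right-hand side is greater
than 7. Taking derivatives with respect to $1/\epsilon$, we see that the right-hand side grows faster as
we increase $1/\epsilon$ (decrease $\epsilon$), and therefore will always be greater. □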
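The two inequalities used in this proof are easy to check numerically. The Python sketch below evaluates $1 + 2\ln(1/\epsilon) \le 1/\epsilon$ on a grid of $\epsilon \le 1/e^2$ and confirms that the bound $e^{\frac{d}{2}(1 - c + \ln c)}$ with $c = 1 + 2\ln(1/\epsilon)$ stays below $\epsilon$ for several $d \ge 2$. The function names and the grid are illustrative assumptions, and a finite grid is of course a spot check, not a proof.

```python
import math


def key_inequality_holds(eps: float) -> bool:
    """Check 1 + 2 ln(1/eps) <= 1/eps."""
    return 1.0 + 2.0 * math.log(1.0 / eps) <= 1.0 / eps


def tail_bound(d: int, eps: float) -> float:
    """e^((d/2)(1 - c + ln c)) with c = 1 + 2 ln(1/eps)."""
    c = 1.0 + 2.0 * math.log(1.0 / eps)
    return math.exp(0.5 * d * (1.0 - c + math.log(c)))


def norm_threshold(sigma: float, d: int, eps: float) -> float:
    """sigma * sqrt(d (1 + 2 ln(1/eps))), the radius in the corollary."""
    return sigma * math.sqrt(d * (1.0 + 2.0 * math.log(1.0 / eps)))


# Assumed grid: eps = e^-2, e^-2 / 1.5, e^-2 / 1.5^2, ...
grid = [math.exp(-2.0) / (1.5 ** j) for j in range(30)]
print(all(key_inequality_holds(e) for e in grid))
print(all(tail_bound(d, e) <= e for d in range(2, 11) for e in grid))
```

For larger $\epsilon$ the inequality can fail (for instance at $\epsilon = 1/2$), which is consistent with the hypothesis $\epsilon \le 1/e^2$.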

We also use the following easy-to-prove fact, a proof of which may be found in [ST03b, Proposition
2.4.7].

Proposition A.0.4. Let $x$ be a $d$-dimensional Gaussian random vector of variance $\sigma^2$ centered
at the origin. Then,

\[
\Pr\left[\|x\| \le \epsilon\right] \le \left(\frac{\epsilon}{\sigma}\right)^{d}.
\]
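A small-ball bound of the form $\Pr[\|x\| \le \epsilon] \le (\epsilon/\sigma)^d$ can be illustrated by simulation. The Python sketch below estimates the probability by Monte Carlo and prints it next to $(\epsilon/\sigma)^d$. The function names, the seed, the number of trials, and the parameters $d = 2$, $\sigma = 1$, $\epsilon = 0.5$ are all illustrative assumptions.

```python
import math
import random


def small_ball_bound(eps: float, sigma: float, d: int) -> float:
    """The bound (eps / sigma)^d."""
    return (eps / sigma) ** d


def empirical_small_ball(eps: float, sigma: float, d: int,
                         trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of Pr[||x|| <= eps] for x with i.i.d. N(0, sigma^2) entries."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sq_norm = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(d))
        if math.sqrt(sq_norm) <= eps:
            hits += 1
    return hits / trials


eps, sigma, d = 0.5, 1.0, 2   # assumed illustrative values
print(empirical_small_ball(eps, sigma, d), small_ball_bound(eps, sigma, d))
```

For $d = 2$ the exact probability is $1 - e^{-\epsilon^2/(2\sigma^2)}$, so the estimate can also be compared against that closed form.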